Expert C# 5.0: with .NET 4.5 Framework (2013)

CHAPTER 11 Collections Explained

This chapter will discuss the different collection types in .NET—Array, List<T>, ArrayList, Stack, Queue, Hashtable, and Dictionary—which are used for storing and managing data in an application. We will look at the internal workings of the List<T> class and how the CLR instantiates an instance of theList<T> class, how it adds items into it, and how it expands its internal array to accommodate more items. We will also examine the internal workings of the ArrayList, Stack, Queue, Hashtable, and Dictionary classes to see how CLR handles these classes to store information.

Collections in .NET

The System.Collections namespace in the .NET contains data structures that can be used in C# to define different kinds of collection types. All collection classes implement the interface ICollection with additional functionality. Figure 11-1 shows the different collection classes in the .NET.

images

Figure 11-1. Different collections class used in .NET

Each of these classes has its own functionality and usage. Table 11-1 summarizes the different types of collection shown in Figure 11-1.

images

Array Class

The Array class is the implicit base class for all single and multidimensional arrays, and it is one of the most fundamental types in implementing the standard collection interfaces. The Array class provides type unification, so a common set of methods is available to all arrays, regardless of their declaration or underlying element type.

Array Class in .NET

The Array class provides methods for creating, manipulating, searching, and sorting arrays. It serves as the base class for all arrays in the CLR. It has been defined in the mscorlib.dll (C:\Windows\Microsoft.NET\ Framework\v4.0.30319) assembly in .NET as shown in Figure 11-2.

images

Figure 11-2. Array class in the System namespace

The signature of the Array class is:

public abstract class Array : ICloneable, IList, ICollection, IEnumerable

The Array class is an abstract type, which implements the ICloneable, IList, ICollection, and IEnumberable interfaces.

Array Class Members

The Array class has different public properties and methods that can be used to manipulate data stored in the Array class. In the next section, we will explore the different properties and methods of the Array class.

Properties

Table 11-2 lists the different properties of the Array class.

images

Methods

The Array class has different methods that can be used to do different operations. Table 11-3 lists the different methods of the Array class.

images

List

The List<T> class used in .NET represents a collection of objects. This class exposes different methods and properties used to manipulate the objects stored in it. This section will examine the List<T> class.

Capacity and Size of the List<T>

The List<T> is a dynamic array whose size increases as required. The number of elements that List<T> can hold is referred to as the capacity of the List<T> class. If the initial capacity (number of items it can hold) of the List<T> is N, then on every phase of List<T> expansion to accommodate more items, the CLR adds extra N empty slots in the internal array of the List<T> or increases to 2N. Figure 11-3 demonstrates the size and expansion of the List<T> class.

images

Figure 11-3. List<T> basics

When you instantiate the List<T> class without specifying the capacity, the initial capacity of the List<T> is zero. On the first Add operation to the List<T> instance, it will increase the capacity with the default capacity being four (N), which is stored in the internal field _defaultCapacity of the List<T>class. The capacity of the List<T> will be increased when needed to accommodate more items in the List<T> instance—such as 4(N) × 2 = 8 (2N), 8(N) × 2 = 16 (2N), 16(N) × 2 = 32 (2N), and so on. It means on every expansion of the List<T> class, a new capacity of the List<T> will be calculated and it will be two times that of the previous capacity. The actual items occupied in the List<T> are referred by the Count property (internally the _size field) of the List<T> class, and the total items can be added to the List<T> referred by the Capacity property. The value of the capacity is updated when the capacity of the List<T> is expanded to accommodate more items. For example, if you consider an instance of the List<T> class whose current capacity is eight (refers by the Capacity property) but it actually contains five items, then the Count property of that instance of the List<T> refers five (internally the _size field) and the Capacity is eight. The capacity of the List<T> can also be set:

· Explicitly by setting the Capacity property

· In the instantiation time of the List<T>

· When the List<T> class is instantiated based on another list of type ICollection, where the size of that list is used as the initial capacity of the List<T> class

The capacity of the List<T> can be decreased by calling the TrimExcess method. If an instance of the List<T> has the capacity N (where N = 4) and only two items have been stored in that instance of the List<T>, calling the TrimExcess method of the CLR will decrease the capacity of the List<T> to two by removing N-2 cells from the internal array of the List<T> class.

List Declaration

The List<T> class is defined in the System.Collections namespace of the mscorlib.dll (C:\Windows\Microsoft.NET\ Framework\v4.0.30319) assembly in .NET as demonstrated in Figure 11-4.

images

Figure 11-4. List<T> class in the System.Collections.Generic namespace

The signature of the List<T> class would be:

public class List<T> : IList<T>, ICollection<T>, IEnumerable<T>, IList, ICollection, IEnumerable

Let’s look at an example where the List<T> class has been used to create an instance of the numbers object, as shown in Listing 11-1.

Listing 11-1. Example of the List<T> Class

using System;
using System.Collections;
using System.Collections.Generic;

namespace Ch11
{
class Program
{
static void Main(string[] args)
{
List<int> numbers = new List<int>();
numbers.Add(0);
numbers.Add(1);
ShowResult(numbers);
}
public static void ShowResult(IEnumerable aList)
{
foreach (var item in aList)
Console.Write("{0}\t", item);
Console.WriteLine();
}
}
}

This program will produce the output:

0 1

Instantiation of the List<T>

The List<T> comes to life when you call one of the following three overloaded constructors:

public List() {}
public List(int capacity) {}
public List(IEnumerable<T> collection) {}

When the CLR executes the first version of the List<T> constructor, it initializes the internal _items array with the default size zero, but on the first Add operation, it will be increased to four. In the second version of the constructor, it initializes the _items array with a given size provided as the parameter capacity. The CLR gets the size from the parameter collection via the Count property of the input collection. For the third version, the constructor initializes the internal _items array and copies the entire contents of the collection into the _items array. Figure 11-5 demonstrates the instantiation of the List<T> class.

images

Figure 11-5. List<T> class and its constructors

Addition Operation in the List<T>

To add any item into the List<T>, the CLR checks whether the item can be stored in the internal _items array. The CLR ensures this by checking the number of the elements in the _items array of the List<T> using the EnsureCapacity method of the List<T> class. If the size is ensured by the EnsureCapacitymethod, the CLR adds that item (which has to be same type because the T refers to the List<T>) to the _items array. Figure 11-6 shows the basic Add and AddRange operations of the List<T> class.

images

Figure 11-6. Addition operation in the List<T>class

The EnsureCapacity method does the entire back-end job to make sure that the List<T> is able to hold the required number of data (based on the available memory) in the _items array. The CLR calls this method with the value of the current _size field of the List<T> + 1. The EnsureCapacity method calculates the new capacity for the _items array and sets this new capacity value into the Capacity property of the List<T> class. The implementation of the EnsureCapacity method of the List<T> class is shown in the Listing 11-2.

Listing 11-2. The Implementation of the EnsureCapacity Method of the List<T> Class

/* min = (current size (value of the _size) of the List<T> + 1) */
private void EnsureCapacity(int min)
{
/* The _items.Legth refers to the current total length of the _items
* array or total cells of the _items array where as _size field of
* the List<T> class refers to the currently used cells from the
* _items array of the List<T> class. To able to add any new item into
* the List<T>, _size has to be < _items.Length */
if (this._items.Length < min)
{
int num = (this._items.Length == 0) ? 4 : (this._items.Length * 2);
if (num > 0x7fefffff)
{
num = 0x7fefffff;
}
if (num < min)
{
num = min;
}
this.Capacity = num;
}
}

The Capacity property of the List<T> class creates a new array using the new capacity value (num from Listing 11-2) as the total length of this new array. It copies the existing contents of the _items array to this new array. The new array will contain all the existing items and four new empty cells for storing extra items. Finally, this new array will replace the existing _items array. Let’s examine the implementation of the Capacity property, as shown in Listing 11-3.

Listing 11-3. The Implementation of the Capacity Property of the List<T> Class

public int Capacity
{
get
{
return this._items.Length;
}
set
{
/*.....*/
/* value refers to the new Capacity value ensured by the
* EnsureCapacity method.*/
if (value > 0)
{
/* Temporary array to hold existing item and new empty items.*/
T[] destinationArray = new T[value];
if (this._size > 0)
{

/* Copy the existing items into the new array. */
Array.Copy(
this._items, /* It refers the source array from
* where items will be copied */
0, /* The position of the source array
* from where the copy will start */
destinationArray, /* It refers the destination array in
* where items will be copied into*/
0, /* The position of the destination array
* in where the item will be
* placed after copying*/
this._size /* Total number of item will be copied
* from the source array.*/
);
}

/* Replace _items with the temporary array which currently holding
* existing contents of the _items array and empty items
* for the expanded cells. */
this._items = destinationArray;
}
else
{
/* Otherwise set _items with 0 items as _emptyArray hold 0 items. */
this._items = List<T>._emptyArray;
}
/*.....*/
}
}

Insert Operation in the List<T>

The InsertRange method works a bit differently in comparison to the Add method. The InsertRange method adds a series of items into the List<T>, starting from a given position of the _items array. From Listing 11-4, you can see that the values {22, 33, 77} are inserted by copying the existing items {3}from the numbers into a new location of the numbers. It started copying items from position 3 of the numbers list and continued until the number of items to copy was equal to the size (4) of the numbers (original sequence) list—the number of items to insert (3). It is also important to calculate the new index where the item will be copied over. The position 6 is calculated using the index plus count, where index refers to the start position of the copy from the source array (provided as input to the InsertRange, which is 3), and count refers to the number of items provided in the InsertRangemethod to be inserted into the original list. As a result, items {3} will be copied into the array position 6, but the capacity of the original list numbers is four. So it needs to expand the capacity to accommodate more items. After the expansion of the original list, the cell at position {3,4,5} of the expanded original list new values {22,33,77} will be copied. Listing 11-4 shows the usage of the InsertRange method.

Listing 11-4. Example of the InsertRange Method of the List<T> Class

using System;
using System.Collections;
using System.Collections.Generic;

namespace Ch11
{
class Program
{
static void Main(string[] args)
{
List<int> numbers = new List<int>()
{
0,1, 2, 3
};
numbers.InsertRange(3, new List<int>() { 22, 33, 77 });
ShowResult(numbers);
}

public static void ShowResult(IEnumerable aList)
{
foreach (var item in aList)
Console.Write("{0}\t", item);
Console.WriteLine();
}
}
}

The program will produce this output:

0 1 2 22 33 77 3

The InsertRange method works as demonstrated in Figure 11-7.

images

Figure 11-7. InsertRange operation in the List<T> class.

From Figure 11-7, you can see that:

· The CLR will copy the items from the original list, starting from the given position (3) (for Listing 11-4) until it reaches the number of items to copy, which is one, because it is calculated using this._size-index, which is (4 - 3) = 1. So it will copy the item from position 3 and store it in position 6 from the _items array in position 6.

· Insert the new values into cells three to five of the _items array, which will overwrite existing {3, 0, 0} items with new values {22, 33, 77}.

Deletion Operation in the List<T>

In List <T> class, you can use the Remove, RemoveAll, or Clear method to do the deletion operation or clear operation to the items from the list. Let’s see how Remove method works internally:

· When the CLR executes the Remove method, it takes an item that is going to be removed as the argument. It looks for the index position of that item in the _items array, using the IndexOf method. This will return the index of that item, and the CLR passes this index as input to the RemoveAtmethod.

· The RemoveAt method copies items from the position of the item index (which is going to remove) plus one until the number of items to copy reaches the current size of the _items array—the position of the item to be removed (index of the item to be removed)—in the destination array at the position of the index of the item to be removed.

· The CLR will set the default value of the type T to the last item of the _items array.

The implementation of the RemoveAt method is shown in Listing 11-5.

Listing 11-5. The Implementation of the RemoveAt Method of the List<T> Class

public void RemoveAt(int index) /* index - The position of the item which is going to
remove */
{
if (index >= this._size)
{
ThrowHelper.ThrowArgumentOutOfRangeException();
}
this._size--;
if (index < this._size)
{
Array.Copy(
this._items, /* Source array - from where
* items will be copied */
index + 1, /* Source index - the position from where
* the item will start copying */
this._items, /* Destination array - to where items will
* be copied over */
index, /* Destination index - the position in where
* the item will be placed after copied */
this._size - index /* Total number of item will be copied */
);
}
this._items[this._size] = default(T); /* Set the last item with the
* default value */
this._version++;
}

List<T> Class Members

This section examines different fields, constructors, and members used in the List<T>.

Constructors

The List<T> class has three overloaded constructors to initialize an instance of the List<T> class. Table 11-4 lists the three constructors that are used to instantiate the List<T> class.

images

Properties

Table 11-5 lists the different properties of the List<T> class.

images

Methods

The List<T> class has different methods that can be used to do different operations. Table 11-6 lists the different methods of the List<T> class.

images

ArrayList

The ArrayList class is another type of collection of the .NET. This section will explore in more detail the ArrayList class.

Capacity and Size of the ArrayList in .NET

The ArrayList is a dynamic array whose size dynamically increases as required. The number of elements that an ArrayList can hold refers to the capacity of the ArrayList. If the initial capacity (number of items it can hold) of the ArrayList is N on every phase of internal _items array expansion to accommodate more items, the CLR adds extra N items or increases to 2N. For example, the default capacity of the ArrayList is four, and it will be increased as 4 × 2 = 8, 8 × 2 = 16, 16 × 2 = 32, and so on. Figure 11-8 demonstrates the size and expansion of the ArrayList class.

images

Figure 11-8. ArrayList basics

When you instantiate the ArrayList class without specifying the initial Capacity, the capacity of the ArrayList is zero. On the first Add operation with the instance of the ArrayList, it will increase the capacity with the value of four (N), which is stored in the internal field _defaultCapacity of the ArrayListclass. It will be increased when needed to accommodate more items in the ArrayList instance, such as 4(N) × 2 = 8 (2N), 8(N) × 2 = 16 (2N), 16(N) × 2 = 32 (2N), and so on (as Figure 11-8 demonstrates). It means on every expansion of the ArrayList class, a new capacity of the ArrayList will be calculated and it will be two times that of the previous capacity. The actual items occupying the ArrayList are referred to as the Count property (internally the _size field) of the ArrayList class, and the total of the items can be added to the ArrayList referred by the Capacity property. The value of the capacity is updated when the CLR expands the capacity of the ArrayList to accommodate more items. For example, if you consider an instance of the ArrayList class whose current capacity is eight (refers by the Capacity property) but it actually contains five items, then the Count property of that instance of the ArrayList refers five (internally the _size field) and the capacity is eight. The capacity of the ArrayList can also be set:

· Explicitly by setting the required value in the Capacity property

· In the instantiation time of the ArrayList

· When the ArrayList will be instantiated based on another list of type ICollection, where the size of that list will be used as the initial capacity of the ArrayList

The capacity of the ArrayList can be decreased by calling the TrimToSize method. If an instance of the ArrayList has the capacity N (where N = 4) and only two items have been stored in that instance, then calling the TrimToSize method CLR will decrease the capacity of the ArrayList by two by removing N-2 cells from the internal array of the ArrayList class.

ArrayList Declaration

The ArrayList class defined in the System.Collections namespace of the mscorlib.dll (C:\Windows\Microsoft.NET\ Framework\v4.0.30319) assembly in .NET is demonstrated in Figure 11-9.

images

Figure 11-9. ArrayList class in the System.Collections namespace

The signature of the ArrayList class is:

public class ArrayList : IList, ICollection, IEnumerable, ICloneable

Let’s look at an example of the ArrayList usage, as shown in Listing 11-6.

Listing 11-6. Example of the ArrayList Class

using System;
using System.Collections;

namespace Ch11
{
class Program
{
static void Main(string[] args)
{
ArrayList terrestrialPlanets = new ArrayList();
terrestrialPlanets.Add("Mercury"); /* Default capacity is 4.*/
terrestrialPlanets.Add("Venus");
terrestrialPlanets.Add("Earth");
terrestrialPlanets.Add("Mars");

ShowResult(terrestrialPlanets);
}

public static void ShowResult(
IEnumerable aList, char mySeparator = ' ')
{
foreach (string item in aList)
Console.Write("{0}{1}", mySeparator, item);
Console.WriteLine();
}
}
}

This program will produce the output:

Mercury Venus Earth Mars

Instantiation of the ArrayList

ArrayList comes to life when you call one of the following three overloaded constructors:

public ArrayList() {}
public ArrayList(int capacity) {}
public ArrayList(ICollection c) {}

When the CLR executes the first version of the ArrayList constructor, it initializes the internal _items array with the default size zero. But on the first Add operation, it will be increased to four. For the second version of the constructor, it initializes the _items array with the given size provided as input to the capacity parameter. The CLR gets the size of the parameter c (refers to a collection) via the Count property of the c collection for the third version of the constructor and uses this size to initialize the _items array. It copies the entire contents of the c into the _items array. Figure 11-10demonstrates the ArrayList instantiation using the different constructors.

images

Figure 11-10. ArrayList class and its constructors

Addition and Insertion Operations in the ArrayList Class

To add any item into the ArrayList, the CLR checks whether the item can be stored in the _items array. The CLR ensures this by checking the current number of elements inside the ArrayList class using the EnsureCapacity method of the ArrayList class. If the size is ensured by the EnsureCapacity method, the CLR adds that item into the expanded _items array. Figure 11-11 demonstrates the Add and Insert operations in the ArrayList.

images

Figure 11-11. Add and Insert operations of the ArrayList class

The implementation of the Add method of the ArrayList class is demonstrated in Listing 11-7.

Listing 11-7. The Implementation of the Add Method of the ArrayList Class

public virtual int Add(object value)
{
if (this._size == this._items.Length)
{
this.EnsureCapacity(this._size + 1);
}
this._items[this._size] = value;
this._version++;
return this._size++;
}

The EnsureCapacity method will do the entire back-end job to make sure that the ArrayList is able to hold the required number (based on the available memory) of data inside it. The CLR calls this EnsureCapacity method with the value of the current _size field of the ArrayList + 1. The EnsureCapacitymethod calculates the current new capacity for the _items array, which sets this new value into the Capacity property of the ArrayList class. The implementation of the EnsureCapacity method for the ArrayList class is shown in Listing 11-8.

Listing 11-8. The Implementation of the EnsureCapacity Method of the ArrayList Class

/* min is current size (_size) of the List<T> + 1 */
private void EnsureCapacity(int min)
{

/* The _items.Legth refers to the current total length of the _items
* array or total. cells of the _items array where as _size field of
* the ArrayList refers to the currently used cells from the _items
* array of the ArrayList class. To able to add any item into
* the ArrayList, _size has to be < _items.Length */
if (this._items.Length < min)
{
int num = (this._items.Length == 0) ? 4 : (this._items.Length * 2);
if (num < 0x7fefffff)
{
num = 0x7fefffff;
}
if (num < min)
{
num = min;
}
this.Capacity = num;
}
}

The Capacity property of the ArrayList class creates a new array with the size of the new capacity value calculated using the EnsureCapacity method. It copies the existing items from the _items array into this new array. The new array will contain all the existing items from the _items array and four new empty cells. Finally, this new array will replace the existing _items array. Let’s look at the implementation of the Capacity property, as shown in Listing 11-9.

Listing 11-9. The Implementation of the Capacity Property of the ArrayList Class

public virtual int Capacity
{
get
{
return this._items.Length;
}
set
{
if (value != this._items.Length)
{
/* value refers to the new Capacity value ensured by
* the EnsureCapacity method.*/
if (value > 0)
{
/* Temporary array to hold existing item and new empty items.*/
object[] destinationArray = new object[value];

if (this._size > 0)
{
/* Copy the existing items into the new array. */
Array.Copy(
/* It refers the source array from where items
* will be copied */
this._items,
/* The position of the source array from where
* the copy will start */
0,
/* It refers the destination array in where items
* will be copied into */
destinationArray,
/* The position of the destination array
* in where the item will be placed after copying */
0,
/* Total number of item will be copied from the
* source array */
this._size
);
}
/* Replace _items with the temporary array which currently
* holding existing contents of the _items array and empty
* items for the expanded cells. */
this._items = destinationArray;
}
else
{
this._items = new object[4]; /* Otherwise set _items with 4
* items. */
}
}
}
}

Insert a Range of Items in the ArrayList

The InsertRange method works a bit differently from the Add method. The InsertRange method adds a series of items into the ArrayList, starting from a given position of the _items array. It started copying from the position 3 (as provided to the InsertRange method) of the numbers and continued until the number of items to copy was equal to the size (4) of the numbers (original sequence), that is, the number of items to insert (3). It is also important to calculate the new index from the items that will be copied. Position 6 is calculated using the index plus count, where the index refers to the start position of the copy from the source array (provided as of input to add to the InsertRange, which is 3), and count refers to the number of items provided in the InsertRange method to insert into the original list. As a result, item {3} will be copied at the array in position 6, but the capacity of the original list is four, so it needs to expand the capacity to accommodate more items. After the expansion of the original list, for the cell at position {3,4,5} from the expanded list, new values {22,33,77} will be copied. Listing 11-10 illustrates the InsertRange method.

Listing 11-10. Example of InsertRange Method of the ArrayList Class

using System;
using System.Collections;
using System.Collections.Generic;

namespace Ch11
{
class Program
{
static void Main(string[] args)
{
ArrayList numbers = new ArrayList()
{
0,1, 2, 3
};
numbers.InsertRange(3, new List<int>() { 22, 33, 77 });
ShowResult(numbers);
}

public static void ShowResult(IEnumerable aList)
{
foreach (var item in aList)
Console.Write("{0}\t", item);
Console.WriteLine();
}
}
}

The program will produce the output:

0 1 2 22 33 77 3

This work is shown in Figure 11-12.

images

Figure 11-12. InsertRange operation of the ArrayList class

From Figure 11-12, you can see that:

· The CLR copies the items from the original list and it starts from the given position (3 for Listing 11-10) until it reaches the number of items to copy (which is 1, because it is calculated using the _size – index, which is (4 - 3) = 1. So it will copy the item from position 3 and store it in the _items array at position 6.

· The CLR will insert the new values into the empty cells in positions 3 to 5 of the _items array.

Deletion Operation in the ArrayList

If you want to remove any item from the ArrayList, you can use the Remove method. The program in Listing 11-11 shows the usage of the Remove method of the ArrayList class.

Listing 11-11. Remove Item from the ArrayList

using System;
using System.Collections;

namespace Ch11
{
class Program
{
static void Main(string[] args)
{
ArrayList terrestrialPlanets = new ArrayList();
terrestrialPlanets.Add("Mercury");
terrestrialPlanets.Add("Venus");
terrestrialPlanets.Add("Earth");
terrestrialPlanets.Add("Mars");

terrestrialPlanets.Remove("Venus");
ShowResult(terrestrialPlanets);
}

public static void ShowResult(IEnumerable aList)
{
foreach (string item in aList)
Console.Write("{0}\t",item);
Console.WriteLine();
}
}
}

This program will produce the output:

Mercury Earth Mars

The CLR takes the following steps to execute terrestrialPlanets.Remove("Venus") statement:

1. Step 1: From the Remove method, the CLR will try to find the index of the Venus object from the internal array _items of the ArrayList class. It will then call the RemoveAt method internally with the index found for the Venus item.

images

Figure 11-13. Remove operation in the ArrayList class

2. Step 2: Inside the RemoveAt method, the CLR calls the Copy method of the Array class with a related argument such as calculated index of the item to remove and the _size field of the ArrayList class. This copy operation replaces one item at a time. The CLR copies the Earth and Mars to replace Venus and Earth of the _items array. The _items array will no longer have Venus in it. The CLR will set the contents of the position Mars (last occurrence) in the _items array with null. Figure 11-13 shows the removed last item set with a null value and (unless you call the TrimToSizemethod) the internal array _items holds the null value for the removed items. The implementation of the RemoveAt method is demonstrated in Listing 11-12.

Listing 11-12. The Implementation of the RemoveAt Method of the ArrayList Class

/* index - The position of the item which is going to remove */
public virtual void RemoveAt(int index)
{
this._size--;
if (index < this._size)
{
Array.Copy(
this._items, /* It refers the source array - from where
* items will be copied */
index + 1, /* The position of the source array from
* where the copy will start */
this._items, /* It refers the destination array in where
* items will be copied over */
index, /* The position of the destination array
* in where the item will be
* placed after copying */
this._size - index /* Total number of item will be copied from
* the source array */
);
}
this._items[this._size] = null;
this._version++;
}

ArrayList Class Members

This section will explore the different fields, constructors, and members used in the ArrayList.

Fields

Table 11-7 lists the different fields from the ArrayList class that the ArrayList class uses internally to implement the ArrayList functionality.

images

Constructors

The ArrayList class has three overloaded constructors to initialize an instance of the ArrayList class. Table 11-8 lists the three constructors that can be used to instantiate the ArrayList class.

images

Properties

Table 11-9 lists the different properties of the ArrayList class.

images

Methods

The ArrayList class has different methods that can be used to do different operations. Table 11-10 lists the different methods of the ArrayList class.

images

Array vs. ArrayList

An ArrayList is a sophisticated version of an array. Most situations that call for an array can use an ArrayList instead. In general, ArrayList is easier to use and has performance similar to an array of type object. Table 11-11 lists the elements of an Array and the ArrayList.

images

Stack

A Stack is a sequence of items that are accessible at only one end of the sequence. The last data item placed into the sequence is the first processed. All activity occurs at the top of the Stack. The Push operation places a data item on top of the Stack, but Pop removes a data item from top of the Stack.

Stack in .NET

When you instantiate the Stack class without specifying the capacity, the initial capacity of the Stack is ten. The initial capacity of the Stack will be set with the default capacity N (= 10), which is stored in the internal field _defaultCapacity of the Stack class by default or otherwise provided the different initial capacity. The Stack class uses the _size field internally to represent how many items the internal _array currently holds. If the current capacity of the Stack is ten and it holds three items, the _size field will be three. The capacity will be increased to 2N to expand the internal _arraywhen needed. Figure 11-14 demonstrates the behavior of the Stack operation.

images

Figure 11-14. Stack basics

If the stack’s current capacity is ten and it needs to add another item, it will increase the capacity to 10 × 2 = 20 (it will be 20 × 2 = 40, 40 × 2 = 80,80 × = 160, 160 × 2 = 320, 320 × 2 = 640, and so on to accommodate more items into the Stack). After adding the new item to the Stack, it will update the _size field by one, so after the addition the _size field will be 11.

Stack Declaration

In .NET, Stack class is defined in the System.Collections namespace of the mscorlib.dll (C:\Windows\ Microsoft.NET \Framework\v4.0.30319) assembly, as shown in Figure 11-15.

images

Figure 11-15. Stack class in the System.Collections namespace

The signature of the Stack class would be:

public class Stack : ICollection, IEnumerable, ICloneable

Let’s look at an example of Stack usage, as shown in Listing 11-13.

Listing 11-13. An Example of Stack Class

using System;
using System.Collections;

namespace Ch11
{
class Program
{
static void Main(string[] args)
{
Stack terrestrialPlanets = new Stack();
terrestrialPlanets.Push("Mercury");
terrestrialPlanets.Push("Venus");
terrestrialPlanets.Push("Earth");
terrestrialPlanets.Push("Mars");

ShowResult(terrestrialPlanets);
}

public static void ShowResult(IEnumerable aList)
{
foreach (string item in aList)
Console.Write("{0}\t",item);
Console.WriteLine();
}
}
}

This program will produce the output:

Mars Earth Venus Mercury

Instantiation of the Stack

The Stack comes to life when you call one of the three overloaded constructors:

public Stack() {}
public Stack(int initialCapacity) {}
public Stack(ICollection col) {}

When the CLR executes the first version of the Stack constructor, it initializes the internal _array with the default value ten. The second version of the constructor initializes the _array with a given size provided as the parameter initialCapacity. The CLR gets the size of the parameter col (using the value of the Count property of the col) for the third version of the constructor, initializes the _array with this size, and copies the entire contents of the col into the _array. Figure 11-16 demonstrates instantiation of the Stack constructors.

images

Figure 11-16. Stack class and its constructors

The implementation of the Stack constructor is shown in Listing 11-14.

Listing 11-14. The Implementation of the Overloaded Stack Constructor

public Stack()
{
this._array = new object[10];
this._size = 0;
this._version = 0;
}

public Stack(int initialCapacity)
{
if (initialCapacity < 10)
{
initialCapacity = 10;
}
this._array = new object[initialCapacity];
this._size = 0;
this._version = 0;
}

From the second constructor, you can see that it takes the initialCapacity to define the size of the internal _array. The third version of the Stack constructor accepts a Collection object as input and initializes the Stack object based on the given Collection object. Listing 11-15 demonstrates the implementation of this third constructor of the Stack class.

Listing 11-15. The Implementation of the Stack Constructor that Accepts a Collection as Input

public Stack(ICollection col) : this((col == null) ? 0x20 : col.Count)
{
if (col == null)
{
throw new ArgumentNullException("col");
}
IEnumerator enumerator = col.GetEnumerator();
while (enumerator.MoveNext())
{
this.Push(enumerator.Current);
}
}

Push Operation in the Stack

When the CLR pushes the element into the Stack, it checks whether the _size field and the length of the internal _array are the same. If they are equal, then the CLR:

· Creates a new temporary array locally with the size of the current _array x 2

· Copies the contents of the existing _array into this new array

· Copies this temporary local array into the _array

Figure 11-17 demonstrates the Push and Pop operation of the Stack class based on the internal _array of the Stack class where the CLR stores the item added to the Stack.

images

Figure 11-17. Push and Pop operation in the Stack

The implementation of the Push method is demonstrated in Listing 11-16.

Listing 11-16. The Implementation of the Push Method of the Stack Class

public virtual void Push(object obj)
{
/* _size refers to the number of cells from _array is being used. */
if (this._size == this._array.Length)
{
object[] destinationArray = new object[2 * this._array.Length];
Array.Copy(this._array, 0, destinationArray, 0, this._size);
this._array = destinationArray;
}
this._array[this._size++] = obj;
this._version++;
}

Peek and Pop Operation in the Stack

While the CLR executes the Peek method, it returns a copy of the top item from the _array of the Stack class, as demonstrated in Listing 11-17 using the implementation of the Peek method.

Listing 11-17. The Implementation Code for the Peek Method of the Stack Class

public virtual object Peek()
{
/* It copies the item stored at position (_size -1) from _array*/
return this._array[this._size - 1];
}

On the other hand, the Pop method will return the top-most items from the _array of the Stack and set the value as null for that position to remove the contents of the value at that position. Listing 11-18 demonstrates the implementation of the Pop method of the Stack class.

Listing 11-18. The Implementation of the Pop Method of the Stack Class

public virtual object Pop()
{
/* Copy the top most value from the _array into the obj2 */
object obj2 = this._array[--this._size];
/* Set the value as null at the top most position of the _array */
this._array[this._size] = null;
return obj2;
}

Clear Operation in the Stack

The Clear method of the Stack class resets all the existing items stored in the _array with their default values and sets the _size of the Stack class to zero. The Clear method does not resize the _array but removes all the values stored in the _array. The implementation of the Clear method is shown inListing 11-19.

Listing 11-19. The Implementation of the Clear Method of the Stack Class

public virtual void Clear()
{
Array.Clear(
this._array, /* The original array which to reset */
0, /* The starting index from where the reset will start */
this._size /* Total number of item to reset which is the size of
* the _array */
);
this._size = 0; /* Set the size of the Stack as 0 which makes the Count
* property as 0 as the Count property is return
* this._size; */
this._version++;
}

Stack Class Members

The Stack class provides several methods for Push and Pop items in a list.

Fields

Table 11-12 lists the different fields the Stack class uses internally to implement the stack functionality.

images

Constructors

The Stack class has three overloaded constructor to initialize an instance of the Stack class.

images

Properties

Table 11-14 lists the different properties of the Stack class.

images

Methods

The Stack class has different methods that can be used to do different operations. Table 11-15 lists the different methods used in the Stack class.

images

Queue

The Queue is the data structure that enables the CLR to enter data into the list in a way that ensures that the item first in will be the first out. A data item enters the Queue at the rear and leaves from the front. When an item is added to the end of the Queue, it is said to be Enqueue; but when it removes an item at the beginning of the Queue, it is said to be Dequeue.

Size of the Queue in .NET

The size of the Queue is represented by N, where N is 32, which is the default size of the Queue unless different values are defined for the size. Figure 11-18 demonstrates the Queue class.

images

Figure 11-18. Queue class basics

Queue Declaration

The Queue class defined in the System.Collections namespace of the mscorlib.dll assembly is demonstrated in Figure 11-19.

images

Figure 11-19. Queue in the System.Collections namespace

The signature of the Queue class is:

public class Queue : ICollection, IEnumerable, ICloneable

Let’s look at an example of the usage of the Queue class, as shown in Listing 11-20.

Listing 11-20. Example of the Queue Class

using System;
using System.Collections;

namespace Ch11
{
class Program
{
static void Main(string[] args)
{
Queue terrestrialPlanets = new Queue();
terrestrialPlanets.Enqueue("Mercury");
terrestrialPlanets.Enqueue("Venus");
terrestrialPlanets.Enqueue("Earth");
terrestrialPlanets.Enqueue("Mars");
ShowResult(terrestrialPlanets);
}

public static void ShowResult(IEnumerable aList)
{
foreach (string item in aList)
Console.Write("{0}\t", item);
Console.WriteLine();
}
}
}

This program will produce the output:

Mercury Venus Earth Mars

Instantiation of the Queue

The Queue comes to life when you call one of the following four overloaded constructors:

public Queue() {}
public Queue(int capacity) {}
public Queue(ICollection col) {}
public Queue(int capacity, float growFactor) {}

When the CLR executes the first version of the Queue constructor, it initializes the internal _array with the default value 32. The second version of the constructor initializes the _array with a given size provided as input to the capacity parameter. The CLR gets the size of the parameter col (using the value of the Count property of the col) for the third version of the constructor and uses this size to initialize the _array. It then copies the entire contents of the col into the _array. Figure 11-20 demonstrates the internal workings of the Queue constructors.

images

Figure 11-20. Queue class and its constructors

The implementation of the Queue constructors is shown in Listing 11-21.

Listing 11-21. The Implementation of the Constructor of the Queue Class

public Queue() : this(0x20, 2f){}
public Queue(ICollection col) : this((col == null) ? 0x20 : col.Count)
{
IEnumerator enumerator = col.GetEnumerator();
while (enumerator.MoveNext())
{
this.Enqueue(enumerator.Current);
}
}
public Queue(int capacity) : this(capacity, 2f){}
public Queue(int capacity, float growFactor)
{
this._array = new object[capacity];
this._head = 0;
this._tail = 0;
this._size = 0;
this._growFactor = (int) (growFactor * 100f);
}

Enqueue and Dequeue Operation in the Queue

When the CLR enqueues an element into the Queue, it checks whether the _size field and the length of the internal _array are the same. If they are equal, then the CLR:

· Ensures the capacity of the internal _array with the new length using the new length, which is the current length of the _array plus four

· Calls the SetCapacity method internally to set this new value for the Queue

· Puts the element at the top of the _array

Figure 11-21 demonstrates the Enqueue and Dequeue operations of the Queue class.

images

Figure 11-21. Enqueue and Dequeue operations in the Queue class

The implementation of the Dequeue method is demonstrated in Listing 11-22.

Listing 11-22. The implementation of the Dequeue Method of the Queue Class

public virtual object Dequeue()
{
object obj2 = this._array[this._head];
this._array[this._head] = null;
this._head = (this._head + 1) % this._array.Length;
this._size--;
this._version++;
return obj2;
}

The implementation of the Enqueue method is demonstrated in Listing 11-23.

Listing 11-23. The Implementation of the Enqueue Method of the Queue Class

public virtual void Enqueue(object obj)
{
if (this._size == this._array.Length)
{
int capacity = (int) ((this._array.Length * this._growFactor) / 100L);
if (capacity < (this._array.Length + 4))
{
capacity = this._array.Length + 4;
}
this.SetCapacity(capacity);
}
this._array[this._tail] = obj;
this._tail = (this._tail + 1) % this._array.Length;
this._size++;
this._version++;
}

Clear Operation in the Queue

The Clear method of the Queue class resets all the existing items stored in the _array with its default value, and it sets the _size, _head, and _tail of the Queue class with zero. The Clear method does not remove all the items stored in the _array but resets them with the default values. The implementation of the Clear method is demonstrated in Listing 11-24.

Listing 11-24. The Implementation of the Clear Method of the Queue Class

public virtual void Clear()
{
/*.........*/
Array.Clear(
this._array, /* The original array which to reset */
_head, , /* The starting index from where the reset will start */
this._size, /* Total number of item to reset*/
);
/*.........*/
this._head = 0;
this._tail = 0;
this._size = 0;
this._version++;
this._size = 0; , /* Set the size of the Queue as 0 which makes the Count,
* property as 0 as the Count property is return
* this._size; */
this._version++;
}

Queue Class Members

This section will explore the different fields, constructors, and members used in the Queue.

Fields

Table 11-16 lists the different fields the Queue class uses internally to implement queue functionality.

images

Constructors

The Queue class has four overloaded constructors to initialize an instance of the Queue class. Table 11-17 lists the four constructors that can be used to instantiate the Queue class.

images

Properties

Table 11-18 lists the different properties of the Queue class.

images

Methods

The Queue class has different methods that can be used to do different operations. Table 11-19 lists the different methods used in the Queue class.

images

Hashtable

A Hashtable class consists of the buckets array, which contains the elements of the collection. A bucket is a virtual subgroup of elements within the Hashtable, which makes searching and retrieving easier and faster than in most collections. The Hashtable in .NET represents a collection of key and value pairs that are organized based on the hash code of the key. A hash function is an algorithm that returns a numeric hash code based on a key. A hash function must always return the same hash code for the same key, and it also could generate the same hash code for two different keys (collision). But a hash function that generates a unique hash code for each unique key results in better performance when retrieving elements from the hash table.

Size of the Hashtable in .NET

The initial size of the Hashtable is three (the first prime number as shown in Table 11-20), and it will resize to accommodate more items using one of the values shown in Table 11-20 to resize the buckets array.

images

The CLR initializes its default size with the three from the primes array as shown in Table 11-20 and progressively adds the next prime number from the primes array to expand the size to accommodate more items. Otherwise, the CLR generates a new prime number manually to resize theHashtable.

Hashtable Declaration

The Hashtable class is defined in the System.Collections namespace of the mscorlib.dll (C:\Windows\Microsoft.NET\ Framework\v4.0.30319) assembly in .NET, as shown in Figure 11-22.

images

Figure 11-22. Hashtable class in the System.Collections namespace

The signature of the Hashtable class is:

public class Hashtable :
IDictionary, ICollection, IEnumerable, ISerializable,
IDeserializationCallback, ICloneable

An example of the Hashtable is shown in Listing 11-25.

Listing 11-25. Example of the Hashtable Class

using System;
using System.Collections;

namespace Ch11
{
class Program
{
static void Main(string[] args)
{
Hashtable terrestrialPlanets = new Hashtable();
terrestrialPlanets.Add("Mercury",
"The innermost of the eight planets.");
terrestrialPlanets.Add("Venus",
"The second planet from the Sun.");
terrestrialPlanets.Add("Earth",
"The third planet from the Sun.");
terrestrialPlanets.Add("Mars",
"The fourth planet from the Sun.");
terrestrialPlanets.Add("Vesta",
"One of the largest asteroids in the Solar System.");

Console.WriteLine(
"Mercury\t{0}\nVenus\t{1}\nEarth\t{2}\nMars\t{3}\nVesta\t{4}",
terrestrialPlanets["Mercury"],
terrestrialPlanets["Venus"],
terrestrialPlanets["Earth"],
terrestrialPlanets["Mars"],
terrestrialPlanets["Vesta"]);
}
}
}

This program will produce the output:

Instantiation of the Hashtable

The Hashtable comes to life when you call one of the following 15 overloaded constructors:

public Hashtable() {}
public Hashtable(bool trash) {}
public Hashtable(IDictionary d) {}
public Hashtable(IEqualityComparer equalityComparer) {}
public Hashtable(int capacity) {}
public Hashtable(IDictionary d, IEqualityComparer equalityComparer) {}
public Hashtable(IDictionary d, float loadFactor) {}
public Hashtable(IHashCodeProvider hcp, IComparer comparer) {}
public Hashtable(int capacity, IEqualityComparer equalityComparer) {}
public Hashtable(IDictionary d, IHashCodeProvider hcp,
IComparer comparer){}
public Hashtable(IDictionary d, float loadFactor, IEqualityComparer
equalityComparer){}
public Hashtable(int capacity, IHashCodeProvider hcp,
IComparer comparer){}
public Hashtable(int capacity, float loadFactor, IEqualityComparer
equalityComparer){}
public Hashtable(IDictionary d, float loadFactor, IHashCodeProvider hcp,
IComparer comparer) {}
public Hashtable(int capacity, float loadFactor, IHashCodeProvider hcp,
IComparer comparer){}

When the CLR executes the first version of the Hashtable constructor, it initializes the internal array buckets with the default size of three. In the Add operation, the CLR checks whether or not the item can be store in the buckets array by checking the current number of the elements in the Hashtableor otherwise increasing the size of the Hashtable to accommodate the new item in it.

Addition Operation in the Hashtable

When you add a new item into the Hashtable, CLR calculates the hash code for the associated key of the given item by calling the InitHash method:

/* num3 refers to the Hashcode of the key of the given item */
uint num3 = this.InitHash(key, this.buckets.Length, out num, out num2);

The InitHash method calculates the hash code for the key of the given item. The implementation of the InitHash method would be:

/*incr hold the value for the num2*/
private uint InitHash(object key, int hashsize, out uint seed, out uint incr)
{
uint num = (uint) (this.GetHash(key) & 0x7fffffff);
seed = num;
incr = 1 + ((uint) (((seed * 0x65) + 1) % (hashsize - 1)));
return num;
}

The CLR uses the num3 variable (contains hash code) to generate the slot index (referring to the position of the buckets array where the item will be added) of the buckets table where it stores the item (buckets is a one-dimensional array, with each of the cells referred to as a slot of the buckets). Each of the items of the buckets array contains a bucket, which is defined as a type of struct, as shown in Listing 11-26.

Listing 11-26. The Implementation of the Bucket Struct Used in the Hashtable Class

private struct bucket
{
public int hash_coll; /* To hold the Hash code of the Key */
public object key; /* To store the Key */
public object val; /* To store the associated value of the Key*/
}

The slot index will be calculated using the hash code of the given key and the current length of the buckets array:

/* num6 refers to the slot index of the buckets table. */ int num6 = (int) (num % this.buckets.Length);

The CLR uses this slot index as the position of the buckets table to add an item to the buckets table. After calculating the slot index, the CLR tries to add the item to the buckets based on three conditions: no collision, collision, and duplication checking. The next sections will explore each of these in detail.

No Collision While Adding an Item to the Hashtable

The CLR looks for the free slot in the buckets using the slot index (num6) to see whether it is already taken or if there is any duplicate key. If there is no collision, CLR adds the key and value into an instance of the bucket stored in the buckets array at the position referred by the slot index (num6), as shows in this implementation:

/* To check whether the item in the position of slot index (num6) of the
* buckets array contains any key or not by checking the bucket, return
* from the buckets array at position num6, holds null for the Key field */
if ((this.buckets[num6].key == null) ||
((this.buckets[num6].key == this.buckets) &&
((this.buckets[num6].hash_coll & 0x80000000L) == 0L)))
{
if (index != -1)
{
num6 = index;
}
Thread.BeginCriticalRegion();
this.isWriterInProgress = true;

/* Stores the value of the item into buckets array at the position of the
* slot index num6 calculated earlier */
this.buckets[num6].val = nvalue;

/* Stores the key of the item into buckets array at the at the position of
* the slot index num6 calculated earlier */
this.buckets[num6].key = key;

/* num3 refers the hash code generated from the Key */
this.buckets[num6].hash_coll |= (int) num3;

/* Increase the count which refers the total number of items stored in
* the buckets array.*/
this.count++;
this.UpdateVersion();
this.isWriterInProgress = false;
Thread.EndCriticalRegion();
}

As a result, the CLR adds the item at the slot index position (num6) of the buckets array.

Collision While Adding an Item to the Hashtable

If the slot at the position of the index (num6) from the buckets is already taken, CLR regenerates the slot index using the old slot index (so the value of the num6 will be regenerated). It also modifies the hash_coll value for the item stored at the position of the old generated slot index (num6) in thebuckets array by doing the (OR operation) for the old hash_coll value:

/* Updates the hashcoll value by doing the OR */
this.buckets[num6].hash_coll |= -2147483648;

This newly computed slot index (num6-regenerated) is used to check whether the buckets has a free slot at that position. If the CLR finds any free slot in the buckets, then it adds the given item to that slot or otherwise repeatedly looks for an available slot in the buckets. The implementation for this would be:

{
if ((index == -1) && (this.buckets[num6].hash_coll >= 0))
{
/* Change the old slot item's hash code value */
this.buckets[num6].hash_coll |= -2147483648;
this.occupancy++;
}

/* Generates new slot index and stores into num6. num2 refers in the
* InitHash method discussed earlier */
num6 = (int) ((num6 + num2) % ((ulong) this.buckets.Length));
if (++num4 < this.buckets.Length)
{
/* Go to the code block in where CLR will check for the availability
* of the empty slot */
}
}

When you try to add more items into the Hashtable but the total items in the Hashtable begins to exceeds the limit of the size of the Hashtable, the buckets size has to expand (the current buckets size is three and the program tries to add more, as shown in Listing 11-25). The CLR uses the next prime number (for example, 7) from the primes array and calls the rehash method to redistribute the existing items from the old buckets into new expanded buckets with new slot indexes calculated based on the new buckets length and hash code value for each of the existing keys stored in the buckets array. In this phase, CLR replaces all the negative hash_coll values (when the index has been taken CLR updates the hash_coll of that item, as discussed earlier) by doing the OR by 0x7fffffff, which rolls back the previous hash_coll value for that item:

bucket.hash_coll & 0x7fffffff

Duplicate Item While Adding to the Hashtable

When the CLR adds an item to the Hashtable, if there is any duplicate key found, the CLR throws an exception, because Hashtable does not allow a duplicate key. Figure 11-23 demonstrates the execution of the Add operation the CLR does in the Hashtable.

images

Figure 11-23. Add operation in the Hashtable.

From Figure 11-23, you can see that:

· The CLR initializes the buckets size with default value three and each of the buckets in the buckets is initialized with its default value. For instance, bucket struct has int of hash_coll, object of key, and val, so it is initialized as {0, null, null} for each of the buckets in the buckets.

· The first item, Mercury, is added to the buckets array at position 0 as the index 0 has been calculated.

· In the iteration of the Venus addition, the CLR generated the slot index 0 for Venus, but it is already taken by Mercury. So a new slot index is generated, which is 2, and the item Venus has been added at position 2 of the buckets array. In addition, the hash_coll value for the index 0 has been ORed by 0x7fffffff, and this ORed value will be rolled back on the buckets expansion and rehashing time.

· When CLR tries to add Earth into the Hashtable, it finds that it needs to expand the existing buckets. It expands the buckets, rehashes the existing items, and adds all the existing items into the buckets with updated slot index information. The new slot index is calculated again based on the newbuckets to add the Earth item.

· Based on this addition rule, Mars and Vesta items are added to the buckets of the Hashtable.

Deletion Operation in the Hashtable

When you try to remove any item stored in the Hashtable based on its key, the CLR calculates the hash code and uses this hash code to generate the index to locate the key in the buckets array. If the index contains the key you are trying to remove, the CLR sets the key and val for that item stored at that position of the buckets array with the null value and decreases the count field of the Hashtable by one.

Hashtable Class Members

This section will explore the different fields, constructors, and members used in the Hashtable.

Properties

Table 11-21 lists the different properties of the Hashtable class.

images

Methods

The Hashtable class has different methods that can be used to do different operations. Table 11-22 lists the different methods of the Hashtable class.

images

Dictionary

The Dictionary<TKey,TValue> is a generic data structure that represents a collection of Key, Value in C#. It maintains a buckets and entries table to implement the Dictionary functionality. The entries table contains an array of a struct entry to hold the key and value information, and the buckets table maintains indexes or position of the items on the basis of keys stored in the entries table. Figure 11-24 demonstrates the relationship between the buckets and entries table in the Dictionary<TKey,TValue> class.

images

Figure 11-24. Buckets and Entries Relationship in the Dictionary<TKey, TValue> Class

Size of the Dictionary in .NET

The Dictionary<TKey, TValue> class initializes the buckets and entries table with the default size three, but the size varies to accommodate more items placed into it. The CLR uses one of the values (prime numbers) shown in Table 11-23 to resize the buckets and entries table. Otherwise, the CLR generates a prime number manually to resize the buckets and entries table.

images

Dictionary Declaration

The Dictionary<TKey, TValue> class defined in the System.Collections.Generic namespace of the mscorlib.dll assembly in .NET is demonstrated in Figure 11-25.

images

Figure 11-25. Dictionary<TKey, TValue> in the System.Collections.Generic namespace

The signature of the Dictionary<TKey, TValue> class would be:

public class Dictionary<TKey, TValue> :
IDictionary<TKey, TValue>,
ICollection<KeyValuePair<TKey, TValue>>,
IEnumerable<KeyValuePair<TKey, TValue>>,
IDictionary,
ICollection,
IEnumerable,
ISerializable,
IDeserializationCallback

Listing 11-27 presents an example of the Dictionary<TKey, TValue> class.

Listing 11-27. Example of the Dictionary<TKey,TValue> Class

using System;
using System.Collections.Generic;

namespace Ch11
{
class Program
{
static void Main(string[] args)
{
IDictionary<string, string> terrestrialPlanets =
new Dictionary<string, string>();

terrestrialPlanets.Add("Mercury",
"The innermost of the eight planets.");
terrestrialPlanets.Add("Venus",
"The second planet from the Sun.");
terrestrialPlanets.Add("Earth",
"The third planet from the Sun.");
terrestrialPlanets.Add("Mars",
"The fourth planet from the Sun.");
terrestrialPlanets.Add("Vesta",
"One of the largest asteroids in the Solar System.");
terrestrialPlanets.Add("Ceres",
"The dwarf planet in the inner Solar System.");
terrestrialPlanets.Add("Pallas",
"One of the largest in the Solar System.");

Console.WriteLine( "Mercury\t{0}\nVenus\t{1}\nEarth\t{2}\nMars\t{3}\
nVesta\t{4}\nCeres\t{5}\nPallas\t{6}\n",
terrestrialPlanets["Mercury" ],
terrestrialPlanets["Venus" ],
terrestrialPlanets["Earth" ],
terrestrialPlanets["Mars" ],
terrestrialPlanets["Vesta" ],
terrestrialPlanets["Ceres" ],
terrestrialPlanets["Pallas" ]);
}
}
}

The program in the Listing 11-27 will produce this output:

Mercury The innermost of the eight planets.
Venus The second planet from the Sun.
Earth The third planet from the Sun.
Mars The fourth planet from the Sun.
Vesta One of the largest asteroids in the Solar System.
Ceres The dwarf planet in the inner Solar System.
Pallas One of the largest in the Solar System

Instantiation of the Dictionary

The Dictionary<TKey, TValue> comes to life when you call one of the following six overloaded constructors:

public Dictionary() {}
public Dictionary(IDictionary<TKey, TValue> dictionary) {}
public Dictionary(IEqualityComparer<TKey> comparer) {}
public Dictionary(int capacity) {}
public Dictionary(IDictionary<TKey, TValue> dictionary,
IEqualityComparer<TKey> comparer){}
public Dictionary(int capacity, IEqualityComparer<TKey> comparer) {}

When the CLR executes the constructor of the Dictionary, it initializes its internal data structure buckets and entries. The buckets table holds the indexes of the entries table, and the entries table holds the {Key, Value} information based on the index computed using the Key. The Key and Valueinformation is stored in the entries table using the Entry struct. The implementation of the Entry struct is shown in Listing 11-28.

Listing 11-28. The Implementation of the Entry Struct Used in the Dictionary Class

private struct Entry
{
public int hashCode; /* It holds the hash code for the Key */
public TKey key; /* It holds the key of the item */
public int next; /* It holds the index of an Entry from
* the entries table */
public TValue value; /* It holds the value of the item */
}

Addition Operation in the Dictionary

In the Add operation in Dictionary, the CLR first initializes two internal data structure buckets and entries. After initialization of these data structures, the CLR calculates the index of the buckets table, and in that position the entry of the buckets table points to the entries table that contains the key and values. The CLR will then add the related items in the Dictionary and store the key and values.

Bucket and Entry Initialization

When the CLR executes the Add operation in the Dictionary<TKey, TValue> for the first time, it initializes two data structures—buckets and entries—with its default value. For example, each of the items of the buckets initializes with -1 and each of the items in the entries table with its default value:

{
0 /* hashCode */,
null /* key */,
0 /* next item in the entry */,
null /* value */
}

Index Calculation

The CLR computes the hash code (Hc) for the given Key (Ki) using this statement:

/* Calculates the Hashcode */
int num = this.comparer.GetHashCode(key) & 0x7fffffff;

The CLR then computes the index using the Hc (num) and the length of the buckets with this statement:

int index = num % this.buckets.Length; /* Calculates the index */

This index Ib is used to find the available free cells (which have a value -1) in the buckets table, and if it finds any, it will store the position of the item from the entries table (where the related item was stored based on the count field of the Dictionary) as an index in the buckets table at the position of the Ib.

The index Ib is used as the position of the buckets table in which to store the position of the entries table, where the Key-associated value (for which the index Ib has been computed) has been stored.

No Collision, Collision, and Duplication Checking While Adding to the Dictionary

The CLR calculates the index of the buckets to store related items during this process where some collision has occurred. The following sections explore this in greater detail.

No Collision While Adding an Item to the Dictionary

The CLR checks whether there is any free cell for the index Ib (it used it as the position number to lookup in the buckets table) in the buckets table or whether the value of that index Ib position from the buckets has already been populated, as shown in the statement:

int i = this.buckets[index]; i >= 0; i = this.entries[i].next

The I > 0 expression indicates whether there is any item in the buckets at the position of the computed index Ib. If the result of the array look up (this.buckets[index] ) is -1 (-1 has been set by the CLR to initialize the bucket of buckets table), the bucket (cell) at that index position has not yet been used. Therefore, -1 means the related bucket cell has been initialized but not yet used. The CLR will use this cell to store the position of the item (associate item with the Key for which the Ib is computed) in the entries table (based on the freeList field of the Dictionary class uses as the position).

The CLR uses the current count (freeList uses the value of the count) of the Dictionary class as the position (which is 0 for the Key Mercury int as shown in Listing 11-27) to store the item in the entries table:

/* freeList holds the value of the count of the Dictionary */
this.entries[freeList].hashCode = num;
/* The value of buckets at position index Ib stores into the next to do the
* linked with collide data*/
this.entries[freeList].next = this.buckets[index];
this.entries[freeList].key = key;
this.entries[freeList].value = value;
/* Update the value of the buckets at position of the computed index Ib with
* the current count (freeList) value. So now the entries and buckets holding
* the same index for the Key */
this.buckets[index] = freeList;

The CLR also increases the count of the Dictionary class by one, and it will be used later as the position of the entries table to add the next item to the Dictionary.

Each of the items in the buckets table has the index information, which refers back to an index of the entries table where items associated with keys have been stored.

Collision and Duplication Checking While Adding an Item to the Dictionary

If the return value of the array look up (buckets[index]) is greater than -1, that means the cell of the buckets is being occupied. Therefore, there is an entry in the entries table, and the array look up return value refers to the position of the entries table in which an entry can be found. This is referred to as a collision. To resolve the collision, the CLR will first iterate through the entry (Ec) stored in the entries table at the position that stored (as value) the index Ib in the buckets table. It then checks whether the hash code of the Key is stored in the entry Ec item in the entries table and whether or not the related hash code for the current key (associated with the item to add) is the same. If it is the same, CLR throws the exception about this duplication.

Collision While Adding an Item to the Dictionary

If there is any collision for the given item (Gi), the CLR will then get the current count value of the Dictionary class via the freeList variable and add the item Gi at that position (freeList) into the entries table as an entry. For the next field of this entry, the CLR puts the entry located in the entries table at the position of the result of buckets[index], as shown in the code:

/* It adds the index of the entry from the entries table at position of
* the result of the buckets[index] into the next field of the current
* entry item stored in the entries table at position of the freeList*/
this.entries[freeList].next = this.buckets[index];

The CLR then updates the information stored in the current index Ib position of the buckets table with the value of the current freeList, which is the latest place where the current item Gi has been stored in the entries table:

/* It updates the information stored in the buckets at position of index (Ib)
* with the current freeList which is the latest count value. */
this.buckets[index] = freeList;

As a result, this ensures the buckets are stored with the right mapping indexes for all items in the entries table. Figure 11-26 demonstrates the Add operation in the Dictionary<TKey,TValue>.

images

Figure 11-26. CLR executing the Add operation in runtime

From Figure 11-26, you can see what the CLR does to add items in the Dictionary:

· The CLR initializes the buckets data with -1 and {0, null, 0, null} for each of the items in the entries table.

· When the CLR tries to add the key Mercury, it calculates the index to use to look up the buckets and check availability of that cell to store the index information of the entries table for the Mercury item. The Mercury-associated item will be stored in the entries table at position 0 and also stored at position 0 in the buckets array of the index.

· When the CLR tries to add Venus, it will calculate the index position, which is position 0, but it has already been taken by the Mercury entry. Therefore, the CLR will increase the current count field of the Dictionary, which then becomes position 1, and adds the Venus information at that position of the count of the Dictionary in the entries array and also links the item at position 0 with it (resolve collisions). The CLR stores the index information in the buckets table at position 0 with this updated new position information in the entries table. The item at position 0 in the buckets table will now be 1 instead of 0.

· When the CLR tries to add the Mars key in the Dictionary, it finds that it needs to expand the buckets and entries array. So after expansion, the CLR rehashs the existing item and mapping information stored in the buckets and entries tables. It will recalculate the index based on the Mars key and update the length of the buckets array. Finally, it adds Mars to the right position in the entries array and updates the mapping information at the buckets array.

· These same steps are repeat to add other items in the array.

Dictionary Look Up

The CLR follows several steps when it looks for an item based on the Key in the Dictionary. First, it accesses the index property to look up the Dictionary by providing the Key. It calls the FindEntry method where it will check whether or not the buckets is null. It will then compute the hash code for the given key (num) and compute the index to look up the buckets table using the hash code and the length of the buckets:

num % this.buckets.Length

The return value of the bucket [num % this.buckets.Length] will be the related index of the entries table used to find the key to match. Using the value returned from the bucket [num % this.buckets.Length ] statement, CLR loops through until it is > = 0; if it is not > 0, then it is -1, which means there is no other associated entry with this entry. As we learned in the initialization, when there is no item in the buckets, the default value is -1, as shown in the implementation of the Dictionary look up based on the Key in Listing 11-29.

Listing 11-29. The Implementation of the FindEntry Method of the Dictionary Class

private int FindEntry(TKey key)
{
if (this.buckets != null)
{
int num = this.comparer.GetHashCode(key) & 0x7fffffff;
for (int i = this.buckets[num % this.buckets.Length]; i >= 0;
i = this.entries[i].next)
{
if ((this.entries[i].hashCode == num) &&
this.comparer.Equals(this.entries[i].key, key))
{
return i;
}
}
}
return -1;
}

In this iteration, it will check whether the item from the entries has the same hash code and key as the look up Key, and if so the CLR stops iteration and returns the value of i, which refers to the index of the entries table. Using this position value, CLR returns an Entry from the entries table, with the implementation of the Index property as shown in Listing 11-30.

Listing 11-30. The Implementation of the Index of the Dictionary<TKey, TValue> Class

public TValue this[TKey key]
{
get
{
int index = this.FindEntry(key);
if (index >= 0)
{
return this.entries[index].value;
}

return default(TValue);
}
set {.....}
}

Dictionary Class Members

The Dictionary<TKey, TValue> class provides methods for creating, manipulating, and searching Keys and Values of items. The sections that follow explore the different fields, constructors, and members used in the Dictionary class.

Properties

Table 11-24 lists the different properties of the Dictionary class.

images

Methods

The Dictionary class has different methods that can be used to do different operations. Table 11-25 lists the different methods of the Dictionary class.

images

Summary

In this chapter, we have learned about the different collection types in .NET, such as Array, List, ArrayList, Stack, Queue, Hashtable, and Dictionary. You have learned how each of these classes dynamically ensures the capacity to accommodate required amounts of data, and you have learned how these classes have been implemented internally by using the array. In the next chapter, you will explore in depth the Linq in C#.