Linq in C# - Expert C# 5.0: with .NET 4.5 Framework (2013)

CHAPTER 12

Linq in C#

This chapter discusses Language Integrated Query (Linq) in .NET and explores in detail the extension methods defined in the Enumerable class that perform Linq operations in C#. First, you will learn the basics of Linq and then examine the behind-the-scenes operation of each extension method provided by the Enumerable class. Focusing on the delegate-based query syntax, you will explore the internal implementation of these extension methods with the help of ildasm.exe and the .NET Reflector tool.

First Look into Linq in .NET

In .NET, any data structure that implements IEnumerable<T> (from the System.Collections.Generic namespace of the mscorlib.dll assembly) can use all the extension methods defined in the Enumerable class of the System.Linq namespace of the System.Core.dll assembly. These delegate-based Linq query operators are all defined in the Enumerable class, and a complete list of them is shown in Table 12-1. As an example, consider the following code, where series has three items:

List<string> series = new List<string>(){"One", "Two", "Three"};

You can use the Where method to filter the list or the Select method to project its items into the output. Because List<T> implements IEnumerable<T>, and Where and Select are extension methods defined in the Enumerable class for IEnumerable<T>, both methods can be called on the list. For example, the Where extension method extends IEnumerable<TSource>, as shown in the following code:

public static IEnumerable<TSource> Where<TSource>(
this IEnumerable<TSource> source, Func<TSource, bool> predicate)

Any type that implements IEnumerable<T> is able to use the extension methods defined in the Enumerable class. For example, the List<T> class implements IEnumerable<T>, as the signature of the List<T> class shows:

public class List<T> :
IList<T>,ICollection<T>,IEnumerable<T>,IList,ICollection,IEnumerable

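The Enumerable class is a static class, so it can be neither inherited nor instantiated; in IL, a static class is marked as both abstract and sealed. The definition of the Enumerable class is shown in the following IL code: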

.class public abstract auto ansi sealed beforefieldinit
System.Linq.Enumerable extends [mscorlib]System.Object

It is defined in the System.Linq namespace of the System.Core.dll assembly, as shown in Figure 12-1.

Figure 12-1. System.Linq.Enumerable class in the System.Linq namespace

The static Enumerable class is a container for the extension methods of IEnumerable<T>, for example:

public static bool Contains<TSource>(
this IEnumerable<TSource> source, TSource value) {}
public static int Count<TSource>(this IEnumerable<TSource> source) {}
public static IEnumerable<TSource> Distinct<TSource>(
this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer){}
/* and many more */

Extension Method

The extension method is the heart of the delegate-based Linq query operation. You can find more detail about the extension method in Chapter 4.

Lambda Expressions and Expression Trees in Linq

In .NET, the Linq query methods (that is, the extension methods defined in the Enumerable class) allow you to perform different operations, such as filtering, projection, key extraction, or grouping, over a sequence of items. In a query operation, you can supply these functions (for example, the filter or the projection) as lambda expressions, which provide a convenient way to write functions that can be passed as arguments for subsequent evaluation. A lambda expression is converted into a CLR delegate, so it must match the method signature defined by the delegate type. For example:

IList<string> expressionNames =
    new List<string>() { "Lambda Expression", "Linq" };
Func<string, bool> filter = item => item.Length > 3;

We can use this filter lambda in the Where method, for example:

/* To filter the strings in the expressionNames variable */
expressionNames.Where(filter);

You could also write this:

expressionNames.Where(item => item.Length > 3);

In general, you are free to use named methods, anonymous methods, or lambda expressions with query methods. Lambda expressions have the advantage of providing the most direct and compact syntax for authoring. Linq also allows you to use Expression<T> (defined in the System.Linq.Expressions namespace of the System.Core.dll assembly) to define the lambda expression. When you assign a lambda expression to an Expression<T>, the C# compiler generates an expression tree for the lambda expression rather than compiling it into a method body. An example of the expression tree using Expression<T> would be:

Expression<Predicate<int>> expression = n => n < 10;

At compile time, the C# compiler determines whether to emit IL instructions or an expression tree, depending on how the lambda expression is used in the query operator. The compiler follows these basic rules:

· When a lambda expression is assigned to a variable, field, or parameter whose type is a delegate, the compiler emits IL that is identical to that of an anonymous method.

· When a lambda expression is assigned to a variable, field, or parameter whose type is Expression<T>, the compiler emits an expression tree instead.

This chapter will explore the Linq query methods that use the delegate-based syntax.
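
The two compiler rules above can be observed directly. The following is a minimal sketch, not taken from the book: the class name LambdaKinds and its two fields are hypothetical names chosen for illustration. It assigns the same lambda once to a delegate type and once to Expression<T>, then inspects the resulting expression tree and compiles it back into a delegate.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical demo class; the names are illustrative only.
public static class LambdaKinds
{
    // Assigned to a delegate type: the compiler emits IL for a method body.
    public static readonly Func<int, bool> AsDelegate = n => n < 10;

    // Assigned to Expression<T>: the compiler emits code that builds a tree.
    public static readonly Expression<Func<int, bool>> AsTree = n => n < 10;

    public static void Main()
    {
        // The delegate can be invoked directly.
        Console.WriteLine(AsDelegate(5));              // True

        // The tree is data: its nodes can be inspected at runtime.
        Console.WriteLine(AsTree.Body.NodeType);       // LessThan
        Console.WriteLine(AsTree.Parameters[0].Name);  // n

        // A tree must be compiled before it can be invoked.
        Func<int, bool> compiled = AsTree.Compile();
        Console.WriteLine(compiled(42));               // False
    }
}
```

The delegate form is what the Enumerable methods in this chapter accept; the expression tree form is useful when code needs to analyze the lambda rather than just run it.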

Deferred Execution in Linq

Deferred execution is an execution pattern by which the CLR ensures that a value is extracted from an IEnumerable<T>-based information source only when it is required. When a Linq operator uses deferred execution, the CLR encapsulates the related information, such as the original sequence and the predicate or selector (if any), into an iterator. The iterator is used only when the information is actually extracted from the original sequence, for example, by the ToList method, by a foreach loop, or manually through the underlying GetEnumerator and MoveNext methods. Figure 12-2 demonstrates deferred execution in Linq.

Figure 12-2. Example of the Deferred Execution

For example, as you can see in Figure 12-2, in order to extract data from the sequence numbers, the CLR first prepares the related information for an Iterator object. In the result fetch phase, it actually executes the iterator to get data from the sequence using the predicate and/or selector that was encapsulated in the preparation phase.

In C#, deferred execution is supported directly by the yield keyword used within the related iterator block. Table 12-1 lists the iterators in C# that use the yield keyword to ensure deferred execution.
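
To make the role of yield concrete, here is a small sketch (the DeferredDemo class, its Log list, and the Numbers method are hypothetical names, not part of the framework). It records the order of events to show that calling an iterator method only creates the iterator, and that the body runs step by step while the caller enumerates it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; the names are illustrative only.
public static class DeferredDemo
{
    // Records each event so that the execution order is visible.
    public static readonly List<string> Log = new List<string>();

    // An iterator block: none of this body runs until MoveNext is called.
    public static IEnumerable<int> Numbers()
    {
        Log.Add("iterator started");
        for (int i = 1; i <= 3; i++)
        {
            Log.Add("yielding " + i);
            yield return i;
        }
    }

    public static void Main()
    {
        IEnumerable<int> sequence = Numbers();  // preparation phase only
        Log.Add("query created");

        foreach (int n in sequence)             // result fetch phase
            Log.Add("consumed " + n);

        // "query created" is logged before "iterator started".
        Log.ForEach(Console.WriteLine);
    }
}
```

Each "yielding" entry is immediately followed by the matching "consumed" entry, which shows that items are produced one at a time and only on demand.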

Query Methods in Linq

Here we will examine the behind-the-scenes operation of the Linq query methods that implement delegate-based syntax. All the query methods have categories, as shown in Table 12-2, which summarizes all the extension methods and their associated categories.

Filtering- and Projection-Based Methods

The filtering method Where and the projection method Select will be explored in this section.

Where and Select

Where and Select are two of the most common extension methods of IEnumerable<TSource>; they filter and project the data of a list or sequence. A sequence must implement IEnumerable<T> to work with these two extension methods. The signatures of the Where and Select extension methods are:

public static IEnumerable<TSource> Where<TSource>(
    this IEnumerable<TSource> source, Func<TSource, bool> predicate)
public static IEnumerable<TSource> Where<TSource>(
    this IEnumerable<TSource> source, Func<TSource, int, bool> predicate)
public static IEnumerable<TResult> Select<TSource, TResult>(
    this IEnumerable<TSource> source, Func<TSource, TResult> selector)
public static IEnumerable<TResult> Select<TSource, TResult>(
    this IEnumerable<TSource> source, Func<TSource, int, TResult> selector)

Listing 12-1 demonstrates the use of the Where and Select extension methods.

Listing 12-1. Example of Where and Select Clause

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<string> numbers = new List<string>()
{
"One", "Two", "Three", "Four", "Five", "Six", "Seven"
};

var numbersLengthThree =
numbers.Where(x => x.Length == 3).Select(x => x).ToList();

numbersLengthThree.ForEach(x => Console.WriteLine(x));
}
}
}

Listing 12-1 creates a sequence of strings and stores it in numbers, which is of type IList<string>. It then filters the items of numbers whose length is equal to three characters and projects the results into a new list, numbersLengthThree. Using the ForEach method, it iterates through the new list and displays the output on the console. The output of Listing 12-1 would be:

One
Two
Six

Internal Operation of the Where and Select Execution

Let’s analyze the code in Listing 12-1 carefully to understand how the CLR handles the Where and Select extension methods while executing the program. Focus on the following line from Listing 12-1 to see how the CLR handles it:

numbers.Where(x => x.Length == 3).Select(x => x).ToList()

This line of code filters and projects data from the numbers sequence. Figure 12-3 demonstrates how the Where and Select extension methods work for Listing 12-1.

Figure 12-3. How the Where and Select extension methods work

From Figure 12-3, you can see that:

· The CLR passes the numbers sequence as input to the Where method along with an instance of the MulticastDelegate, which holds the information about the <Main>b__1 method that the C# compiler generates at compile time for the anonymous method block (x => x.Length == 3).

· The Where method returns an instance of the WhereListIterator<string> iterator, which is then used in the Select method along with another instance of the MulticastDelegate, which holds the information about the <Main>b__2 method that the compiler generates at compile time from the anonymous method (x => x).

· The Select method instantiates the relevant iterator based on the input enumerable; for the program in Listing 12-1, this is the WhereSelectListIterator<string, string> iterator. Up to this point, the CLR has not processed the original list because of the deferred execution.

· The CLR passes this iterator instance as input to the ToList method, which finally processes the original list: it iterates through the sequence, applies the filtering criteria (<Main>b__1), and uses the projection (<Main>b__2) to produce a new list as output.
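
One observable consequence of the steps above is that the original list is read only when ToList runs. The sketch below (the DeferredQueryDemo class and its Run method are hypothetical names) builds the same Where and Select chain as Listing 12-1 on a shorter list, then adds an item before executing the query; the late item still appears in the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
public static class DeferredQueryDemo
{
    public static List<string> Run()
    {
        var numbers = new List<string> { "One", "Two", "Three" };

        // Only builds the iterator chain; the list is not read yet.
        IEnumerable<string> query =
            numbers.Where(x => x.Length == 3).Select(x => x);

        // Added after the query was defined but before it is executed.
        numbers.Add("Six");

        // ToList triggers the iteration, so "Six" is part of the result.
        return query.ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));  // One, Two, Six
    }
}
```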

Execution of the Where Method

This section examines the steps the CLR takes when executing the Where method.

1. Step 1: The C# compiler compiles the anonymous method block x => x.Length == 3 passed to the Where method into <Main>b__1. The following is the decompiled IL code that the C# compiler generates for the anonymous method x => x.Length == 3:

.method private hidebysig static bool <Main>b__1(string x) cil managed
{
/* Code removed */
L_0000: ldarg.0

/* To get the Length of the string */
L_0001: callvirt instance int32 [mscorlib]System.String::get_Length()
L_0006: ldc.i4.3

/* Check the equality of the length */
L_0007: ceq
L_0009: stloc.0
L_000a: br.s L_000c
L_000c: ldloc.0
L_000d: ret
}

The equivalent C# code would be:

private static bool <Main>b__1(string x) {
return (x.Length == 3);
}

The CLR instantiates an instance of the MulticastDelegate using the method <Main>b__1, which will be used as the delegate to filter items from the sequence. At compile time, the C# compiler also creates another method, <Main>b__2, for the lambda (x => x), as demonstrated in the following IL code:

.method private hidebysig static string <Main>b__2(string x) cil managed
{
/* Code removed */
L_0000: ldarg.0
L_0001: stloc.0
L_0002: br.s L_0004
L_0004: ldloc.0
L_0005: ret
}

Or with the equivalent C# code:

private static string <Main>b__2(string x)
{
return x;
}

The CLR uses <Main>b__2 to instantiate another instance of the MulticastDelegate, which will be used as the projector to project items from the filtered sequence.
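
Because the compiler turns each lambda into an ordinary static method, the same query can be written with hand-written named methods. The following sketch assumes two hypothetical methods, IsLengthThree and Identity, that play the roles of <Main>b__1 and <Main>b__2; it is an illustration of the idea, not the compiler's actual output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
public static class NamedMethodDemo
{
    // Hand-written counterpart of the generated <Main>b__1 method.
    public static bool IsLengthThree(string x) { return x.Length == 3; }

    // Hand-written counterpart of the generated <Main>b__2 method.
    public static string Identity(string x) { return x; }

    public static List<string> Run()
    {
        IList<string> numbers = new List<string>
        {
            "One", "Two", "Three", "Four", "Five", "Six", "Seven"
        };

        // Method group conversions create the delegate instances.
        Func<string, bool> predicate = IsLengthThree;
        Func<string, string> selector = Identity;

        return numbers.Where(predicate).Select(selector).ToList();
    }

    public static void Main()
    {
        // Func<T, TResult> derives from MulticastDelegate.
        Console.WriteLine(typeof(Func<string, bool>).BaseType);

        Console.WriteLine(string.Join(", ", Run()));  // One, Two, Six
    }
}
```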

2. Step 2: When the CLR starts execution of the program in Listing 12-1, it starts execution of the Where method. It passes the original sequence numbers and instance of MulticastDelegate (created in step 1) as input, as demonstrated in the following decompiled IL code of the Main method from Listing 12-1:

.method private hidebysig static void Main(string[] args) cil managed {
    .entrypoint
    .maxstack 4
    .locals init (
        [0] class [mscorlib]System.Collections.Generic.IList`1<string> numbers
        /* Code removed */)
    L_0000: nop

    /* Step 1: Created List<string> object for the numbers */
    L_0001: newobj instance void
        [mscorlib]System.Collections.Generic.List`1<string>::.ctor()
    /* Code removed */
    L_005c: stloc.0

    /* Step 2: Load the numbers variable into the evaluation stack */
    L_005d: ldloc.0
    L_005e: ldsfld class
        [mscorlib]System.Func`2<string, bool> /* Field type */
        Ch12.Program::CS$<>9__CachedAnonymousMethodDelegate4
    L_0063: brtrue.s L_0078
    L_0065: ldnull
    L_0066: ldftn bool Ch12.Program::<Main>b__1(string)

    L_006c: newobj instance void
        [mscorlib]System.Func`2<string, bool>::.ctor(object, native int)

    L_0071: stsfld class
        [mscorlib]System.Func`2<string, bool>
        /* It is holding the <Main>b__1 now */
        Ch12.Program::CS$<>9__CachedAnonymousMethodDelegate4

    L_0076: br.s L_0078
    L_0078: ldsfld class
        [mscorlib]System.Func`2<string, bool>
        Ch12.Program::CS$<>9__CachedAnonymousMethodDelegate4

    /* Step 3: CLR will call the Where extension by passing numbers, loaded
     * onto the evaluation stack at L_005d, and the delegate object of
     * System.Func`2<string, bool> (inherited from MulticastDelegate),
     * loaded at L_0078 */
    L_007d: call
        /* Return type of the Where */
        class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>
        [System.Core]System.Linq.Enumerable::Where<string>(
            /* First parameter type, CLR will pass value for this */
            class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>,
            /* Second parameter type, CLR will pass value for this */
            class [mscorlib]System.Func`2<!!0, bool>
        )
    /* Code removed */
}

3. Step 3: Based on the data type of the numbers, the Where method returns the appropriate iterator instance as output. The implementation of the Where method to return an appropriate iterator is demonstrated in Listing 12-2.

Listing 12-2. The Implementation of the Where Method

public static IEnumerable<TSource> Where<TSource>(
    this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
    if (source is Iterator<TSource>)
        return ((Iterator<TSource>) source).Where(predicate);
    if (source is TSource[])
        return new WhereArrayIterator<TSource>((TSource[]) source, predicate);
    if (source is List<TSource>)
        return new WhereListIterator<TSource>(
            (List<TSource>) source, predicate);
    return new WhereEnumerableIterator<TSource>(source, predicate);
}

public static IEnumerable<TSource> Where<TSource>(
    this IEnumerable<TSource> source, Func<TSource, int, bool> predicate)
{
    return WhereIterator<TSource>(source, predicate);
}

You can find the complete list of iterator classes used by the different Linq extension methods in the System.Core.dll assembly. Figure 12-4 shows the iterators used for the Where method.

Figure 12-4. Iterator classes used in the Enumerable class of the System.Core.dll assembly

For the program in Listing 12-1, the CLR returns the WhereListIterator<TSource> iterator from the Where method, which contains the original list as the source sequence and <Main>b__1 as the predicate. The CLR passes this WhereListIterator<TSource> as input to the Select method.
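
The second Where overload in Listing 12-2 takes a Func<TSource, int, bool>, so the predicate also receives the zero-based position of each item. A short usage sketch follows; the IndexedWhereDemo class and its Run method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
public static class IndexedWhereDemo
{
    public static List<string> Run()
    {
        IList<string> numbers = new List<string>
        {
            "One", "Two", "Three", "Four", "Five", "Six", "Seven"
        };

        // The second lambda parameter is the zero-based index of the item,
        // so this keeps the items at positions 0, 2, 4, and 6.
        return numbers.Where((x, index) => index % 2 == 0).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));  // One, Three, Five, Seven
    }
}
```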

Execution of the Select Method

While executing the Select method, the CLR instantiates the relevant iterator based on the input of the IEnumerable object and the selector delegate. For the program in Listing 12-1, the CLR will return an instance of the WhereSelectListIterator<TSource,TResult> as output from the Select method. The implementation of the Select method to return an appropriate iterator is demonstrated in Listing 12-3.

Listing 12-3. Implementation Code for the Select Method

public static IEnumerable<TResult> Select<TSource, TResult>(
    this IEnumerable<TSource> source, Func<TSource, TResult> selector)
{
    if (source is Iterator<TSource>)
        return ((Iterator<TSource>) source).Select<TResult>(selector);
    if (source is TSource[])
        return new WhereSelectArrayIterator<TSource, TResult>(
            (TSource[]) source, null, selector);
    if (source is List<TSource>)
        return new WhereSelectListIterator<TSource, TResult>(
            (List<TSource>) source, null, selector);

    return new WhereSelectEnumerableIterator<TSource, TResult>(
        source, null, selector);
}

public static IEnumerable<TResult> Select<TSource, TResult>(
    this IEnumerable<TSource> source, Func<TSource, int, TResult> selector)
{
    return SelectIterator<TSource, TResult>(source, selector);
}

The iterator returned by the Select method contains the original list, the predicate delegate (<Main>b__1), and the selector or projector delegate (<Main>b__2).

4. Step 4: Because of the deferred execution, the CLR will not process the sequence until the program calls the ToList method or iterates over the result with a foreach loop. The CLR passes the WhereSelectListIterator<string,string> iterator instance as input to the ToList method. In the ToList method, the CLR instantiates a List<string> by passing the WhereSelectListIterator<string,string> iterator to its constructor.

5. Step 5: The CLR iterates through the enumerator to get each item from the input parameter collection (WhereSelectListIterator<string,string>), which applies the filtering criteria and the projection, and adds the result to the dynamic _items array (a field of the List<TSource> class). This new list object is returned as the result of the query over the original list. The List<TSource> constructor used in this instantiation process is shown in Listing 12-4.

Listing 12-4. Constructor of the List<TSource>

public List(IEnumerable<T> collection)
{
    /* Check whether the input also implements ICollection<T> */
    ICollection<T> is2 = collection as ICollection<T>;
    if (is2 != null)
    {
        int count = is2.Count;

        /* Initialize _items using the number of items in the input. */
        this._items = new T[count];
        is2.CopyTo(this._items, 0);
        this._size = count;
    }
    else
    {
        this._size = 0;
        this._items = new T[4];
        using (IEnumerator<T> enumerator = collection.GetEnumerator())
        {
            /* Iterate through each item of the input sequence. For the
             * iterator from Listing 12-1, each MoveNext call executes the
             * predicate over the source items, and every item for which
             * it returns true is added to the new list. */
            while (enumerator.MoveNext())
            {
                /* The Add method works internally as a dynamic array by
                 * adjusting the size of the array using the EnsureCapacity
                 * method (see Chapter 11 for more about the
                 * EnsureCapacity method). */
                this.Add(enumerator.Current);
            }
        }
    }
}
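
The else branch of Listing 12-4 is the manual GetEnumerator and MoveNext pattern mentioned earlier. The sketch below reproduces that pattern in a hypothetical helper, Materialize, to show that driving the enumerator by hand is what executes the deferred query; it is a simplified illustration, not the framework's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
public static class ToListSketch
{
    // Simplified stand-in for the enumerator branch of the List<T> constructor.
    public static List<T> Materialize<T>(IEnumerable<T> collection)
    {
        var result = new List<T>();
        using (IEnumerator<T> enumerator = collection.GetEnumerator())
        {
            // Each MoveNext call advances the deferred query by one item.
            while (enumerator.MoveNext())
                result.Add(enumerator.Current);
        }
        return result;
    }

    public static void Main()
    {
        IList<string> numbers = new List<string>
        {
            "One", "Two", "Three", "Four", "Five", "Six", "Seven"
        };

        IEnumerable<string> query =
            numbers.Where(x => x.Length == 3).Select(x => x);

        Console.WriteLine(string.Join(", ", Materialize(query)));  // One, Two, Six
    }
}
```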

You have explored in detail how the Where and Select extension methods filter and project data from a sequence. The following section will examine the partitioning-based extension methods of Linq that can be used to manipulate the sequence.

Partitioning-Based Methods

This section will explore the partitioning-based methods, such as Skip, SkipWhile, Take, and TakeWhile.

Skip

The Skip method is used to bypass a specified number of elements from a sequence and then return the remaining elements as output. Due to the deferred execution, the immediate return value is an object of the relevant iterator type that stores all the information that is required to perform the skip operation. The Skip method iterates through the list and skips the specified number of items from the beginning of the list. The specified number is provided as a parameter of this method. The signature of the Skip extension method is:

public static IEnumerable<TSource> Skip<TSource>(this IEnumerable<TSource> source, int count)

The program in Listing 12-5 creates a sequence of strings named numbers that holds One, Two, Three, Four, and Five as its items, and then skips the first two items.

Listing 12-5. Example of the Skip Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<string> numbers = new List<string>()
{
"One","Two","Three", "Four","Five"
};

var result = numbers.Skip(2);
result.ToList().ForEach(number => Console.WriteLine(number));
}
}
}

The program in Listing 12-5 will produce the output:

Three
Four
Five

Internal Operation of the Skip Method of Execution

Let’s look at how the CLR executes the numbers.Skip(2) from the code in Listing 12-5.

1. Step 1: The CLR returns the SkipIterator<TSource> (which holds the original list and the count to define the number of items to skip) from the Skip method.

2. Step 2: Due to the deferred execution pattern, this SkipIterator<TSource> is executed by the CLR only when the sequence is iterated, for example, by the ToList method, as shown in Listing 12-5. Inside the SkipIterator<TSource>, the CLR runs a loop that continues until the number of items to skip becomes zero. During this iteration, the CLR advances the current position of the inner Enumerator object. When the number of items to skip becomes zero, a second loop continues from that position and returns the remaining items from the original sequence as output, as demonstrated in Listing 12-6.

Listing 12-6. Implementation Code for the Skip Method

private static IEnumerable<TSource> SkipIterator<TSource>(
    IEnumerable<TSource> source,
    int count)
{
    using (IEnumerator<TSource> iteratorVariable0 = source.GetEnumerator())
    {
        /* First loop: skip items from the beginning of the list as long
         * as count > 0 */
        while ((count > 0) && iteratorVariable0.MoveNext())
            count--;
        /* If count has reached 0, the CLR returns the rest of the items
         * as output. */
        if (count <= 0)
        {
            /* Second loop: continue iterating the original list from the
             * point where the first loop stopped */
            while (iteratorVariable0.MoveNext())
                yield return iteratorVariable0.Current;
        }
    }
}
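
The two conditions in Listing 12-6 also determine what happens at the boundaries. The following sketch (the SkipEdgeCases class and its CountAfterSkip method are hypothetical names) counts the items left after Skip for a few values of count: zero or a negative count skips nothing, and a count larger than the sequence yields an empty result rather than an exception.

```csharp
using System;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
public static class SkipEdgeCases
{
    private static readonly string[] Numbers =
    {
        "One", "Two", "Three", "Four", "Five"
    };

    // Returns how many items remain after skipping 'count' items.
    public static int CountAfterSkip(int count)
    {
        return Numbers.Skip(count).Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountAfterSkip(2));   // 3: Three, Four, Five
        Console.WriteLine(CountAfterSkip(0));   // 5: nothing is skipped
        Console.WriteLine(CountAfterSkip(-1));  // 5: a negative count skips nothing
        Console.WriteLine(CountAfterSkip(10));  // 0: the source runs out first
    }
}
```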

SkipWhile

The SkipWhile extension method bypasses elements from the sequence as long as a specified condition is true and it returns the remaining elements as output. The signature of this extension method is:

public static IEnumerable<TSource> SkipWhile<TSource>(
this IEnumerable<TSource> source, Func<TSource, bool> predicate)
public static IEnumerable<TSource> SkipWhile<TSource>(
this IEnumerable<TSource> source, Func<TSource, int, bool> predicate)

An example of the use of the SkipWhile method is provided in Listing 12-7.

Listing 12-7. Example of the SkipWhile Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<string> numbers = new List<string>()
{
"One","Two","Three", "Four","Five"
};

var result = numbers.SkipWhile(number => number.Length == 3);
result.ToList().ForEach(number => Console.WriteLine(number));
}
}
}

The program in Listing 12-7 will produce the output:

Three
Four
Five

Internal Operation of the SkipWhile Method of Execution

The output shows that the SkipWhile method skipped the leading items whose length is equal to three (One and Two) and returned the rest of the sequence, starting from the first item that fails the condition. Let’s analyze the code in Listing 12-7 in detail to understand what’s happening when the CLR executes the SkipWhile method.

1. Step 1: The C# compiler constructs a method <Main>b__1 from the anonymous method (number => number.Length == 3) at compile time. The CLR instantiates an instance of MulticastDelegate using the <Main>b__1 method and passes it as the predicate to SkipWhile. The CLR then instantiates the SkipWhileIterator iterator using the original list and the predicate <Main>b__1. Due to the deferred execution, the SkipWhileIterator will not execute until the program calls the ToList method or iterates over the result with a foreach loop.

2. Step 2: The CLR executes the ToList method over the SkipWhileIterator returned from the SkipWhile method. Inside the SkipWhileIterator, the CLR loops through the original list one by one and executes the predicate over each item. As long as the predicate returns true, the item is skipped. As soon as the predicate returns false for an item, the SkipWhileIterator returns that item and every item after it, without evaluating the predicate again. The implementation of the SkipWhileIterator used by the CLR at runtime is shown in Listing 12-8.

Listing 12-8. Implementation Code of the SkipWhileIterator

private static IEnumerable<TSource> SkipWhileIterator<TSource>(
    IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    bool iteratorVariable0 = false;
    foreach (TSource iteratorVariable1 in source)
    {
        if (!iteratorVariable0 && !predicate(iteratorVariable1))
            iteratorVariable0 = true;

        if (iteratorVariable0)
            yield return iteratorVariable1;
    }
}
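
The flag in Listing 12-8 means that SkipWhile removes only the leading run of matching items. The sketch below (the SkipWhileDemo class and its Run method are hypothetical names) uses a list that ends with another three-letter item to make this visible: Six matches the condition but is still returned, because the predicate had already failed on Three.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
public static class SkipWhileDemo
{
    public static List<string> Run()
    {
        IList<string> numbers = new List<string>
        {
            "One", "Two", "Three", "Four", "Six"
        };

        // Skips "One" and "Two"; stops skipping at "Three" for good.
        return numbers.SkipWhile(number => number.Length == 3).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));  // Three, Four, Six
    }
}
```

To remove every three-letter item regardless of position, Where(number => number.Length != 3) would be the appropriate method.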

Take

The Take method is used to return the specified number of elements from the start of a sequence. An example of the Take method is shown in Listing 12-9.

Listing 12-9. Example of the Take Method

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> series = new List<int>()
{
1,2,3,4,5,6,7
};
series.Take(4)
.ToList()
.ForEach(number =>
Console.Write(string.Format("{0}\t",number)));
}
}
}

This program will produce the output:

1 2 3 4

The Take method returns the TakeIterator, which encapsulates the original list and the count that holds the number of elements to take. Because of the deferred execution, the iterator executes only when the result is enumerated, for example, by the ToList method or a foreach loop.
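
The chapter does not list the code of the TakeIterator, so the following is only a sketch of how such an iterator could be written with yield, under the assumption that it behaves like the Skip and TakeWhile iterators shown in this chapter. The TakeSketch class and the TakeFirst method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; not the framework's implementation.
public static class TakeSketch
{
    // Yields at most 'count' items from the start of 'source'.
    public static IEnumerable<T> TakeFirst<T>(IEnumerable<T> source, int count)
    {
        if (count <= 0)
            yield break;                 // nothing to take

        foreach (T item in source)
        {
            yield return item;
            if (--count == 0)
                yield break;             // stop without reading further items
        }
    }

    public static void Main()
    {
        IList<int> series = new List<int> { 1, 2, 3, 4, 5, 6, 7 };

        Console.WriteLine(string.Join("\t", TakeFirst(series, 4)));  // 1 2 3 4
        Console.WriteLine(
            TakeFirst(series, 4).SequenceEqual(series.Take(4)));     // True
    }
}
```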

TakeWhile

The TakeWhile method returns items from the start of a sequence as long as a provided condition is true. Listing 12-10 shows the use of the TakeWhile method.

Listing 12-10. Example of the TakeWhile Method

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> series = new List<int>()
{
1,2,3,4,5,6,7
};
Console.WriteLine("When the condition is true");
series.TakeWhile(number => number < 3)
.ToList()
.ForEach(number => Console.Write(string.Format("{0}\t",
number)));

Console.WriteLine("\nOn first false return iteration will stop ");
series.TakeWhile(number => number > 3)
.ToList()
.ForEach(number => Console.Write(string.Format("{0}\t",
number)));
}
}
}

This program produces the output:

When the condition is true
1 2
On first false return iteration will stop

Internal Operation of the TakeWhile Method of Execution

The TakeWhile method returns a TakeWhileIterator that holds the original list and the predicate. When ToList is executed over it, ToList creates a new List, and the List constructor loops through the TakeWhileIterator. As long as the predicate returns true for an item in the list, the iterator returns that item and continues down the list. On the first false return from the predicate, the CLR stops iterating the original list and returns whatever it has found so far (that is, only the leading items that meet the condition). This is why the second query in Listing 12-10 returns nothing: the predicate number > 3 is false for the first item, 1. The implementation of the TakeWhileIterator would be:

private static IEnumerable<TSource> TakeWhileIterator<TSource>(
    IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    foreach (TSource iteratorVariable0 in source)
    {
        /* As soon as the predicate returns false for an item from the
         * original list, break the iteration */
        if (!predicate(iteratorVariable0))
        {
            break;
        }
        /* Otherwise return that item from the original list */
        yield return iteratorVariable0;
    }
}
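
The break statement is what distinguishes TakeWhile from Where. As a brief comparison (the TakeWhileVersusWhere class and its two methods are hypothetical names), the sketch below applies the same predicate with both methods to a sequence in which matching items reappear after a non-matching one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
public static class TakeWhileVersusWhere
{
    private static readonly int[] Series = { 1, 2, 5, 1, 2 };

    // Stops at the first item that fails the predicate (5).
    public static List<int> WithTakeWhile()
    {
        return Series.TakeWhile(number => number < 3).ToList();
    }

    // Tests every item, so the trailing 1 and 2 are also returned.
    public static List<int> WithWhere()
    {
        return Series.Where(number => number < 3).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("\t", WithTakeWhile()));  // 1 2
        Console.WriteLine(string.Join("\t", WithWhere()));      // 1 2 1 2
    }
}
```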

Concatenation Methods

This section discusses the operations of the Concat method in detail.

Concat

The Concat extension method concatenates two sequences into one sequence. This method differs from the Union method because the Concat<TSource> returns all the original elements in the input sequences regardless of the duplicates, whereas the Union method returns only unique elements from the sequences. The signature for this extension method would be:

public static IEnumerable<TSource> Concat<TSource>(
this IEnumerable<TSource> first, IEnumerable<TSource> second)

The program in Listing 12-11 demonstrates the Concat method.

Listing 12-11. Example of the Concat Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> listOne = new List<int>()
{
1,2,3,4,5,6
};

IList<int> listTwo = new List<int>()
{
6,7,8,9,10
};

var result = listOne.Concat(listTwo).ToList();
result.ForEach(x => Console.Write(string.Format("{0}\t",x)));
}
}
}

The program in Listing 12-11 will produce the output:

1 2 3 4 5 6 6 7 8 9
10

Internal Operation of the Concat Method of Execution

The CLR concatenates items from the sequence listOne and listTwo into a new sequence that holds all the items, for example, { 1,2,3,4,5,6,6,7,8,9,10 }, with duplicated values. Figure 12-5 demonstrates how the Concat method works internally.

Figure 12-5. How the Concat extension method works

Figure 12-5 shows that the CLR passes the listOne and listTwo sequences as input to the Concat method, which returns an instance of ConcatIterator<int> as output to the caller. Because the Concat method uses the deferred execution pattern, the concatenation happens only when the ToList method iterates through the ConcatIterator<int>, which returns the items of listOne followed by the items of listTwo to produce the final output sequence. Let’s analyze the code in Listing 12-11 to help explain what’s happening when the CLR executes the Concat method.

1. Step 1: The CLR passes the original lists listOne and listTwo to the Concat<TSource> method as input.

2. Step 2: In the Concat method, the CLR instantiates the ConcatIterator<int> iterator, which holds the listOne and listTwo and returns to the caller of the Concat method. The implementation of the Concat method would be:

public static IEnumerable<TSource> Concat<TSource>(
this IEnumerable<TSource> first,
IEnumerable<TSource> second)
{
return ConcatIterator<TSource>(first, second);
}

3. Step 3: The ToList method loops through listOne and then listTwo via the Enumerator object returned from the ConcatIterator<int> instance, inserts each of the items into a new list, and returns this new list as the result of the Concat extension method.
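
The chapter does not show the body of the ConcatIterator, so the following is a hedged sketch of how the concatenation logic could be expressed with yield: one loop over the first sequence followed by one loop over the second. The ConcatSketch class and the ConcatTwo method are hypothetical names. The sketch also contrasts the result with Union, which drops the duplicated 6.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; not the framework's implementation.
public static class ConcatSketch
{
    // Yields every item of 'first', then every item of 'second'.
    public static IEnumerable<T> ConcatTwo<T>(
        IEnumerable<T> first, IEnumerable<T> second)
    {
        foreach (T item in first)
            yield return item;
        foreach (T item in second)
            yield return item;
    }

    public static void Main()
    {
        IList<int> listOne = new List<int> { 1, 2, 3, 4, 5, 6 };
        IList<int> listTwo = new List<int> { 6, 7, 8, 9, 10 };

        // Duplicates are kept: 11 items, with 6 appearing twice.
        Console.WriteLine(ConcatTwo(listOne, listTwo).Count());  // 11

        // Same items and order as the built-in Concat method.
        Console.WriteLine(ConcatTwo(listOne, listTwo)
            .SequenceEqual(listOne.Concat(listTwo)));            // True

        // Union returns each distinct value once: 10 items.
        Console.WriteLine(listOne.Union(listTwo).Count());       // 10
    }
}
```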

Ordering Methods

This section will explore the ThenBy and Reverse ordering methods.

ThenBy

The ThenBy method performs a subsequent ordering of the elements in a sequence in ascending order. Because ThenBy follows the deferred execution pattern, its immediate return value is an object of the relevant type that stores all the required information, such as the original list, the key selector, and so forth. The signatures of the ThenBy extension method are:

public static IOrderedEnumerable<TSource> ThenBy<TSource, TKey>(
this IOrderedEnumerable<TSource> source, Func<TSource, TKey> keySelector)
public static IOrderedEnumerable<TSource> ThenBy<TSource, TKey>(
this IOrderedEnumerable<TSource> source,
Func<TSource, TKey> keySelector, IComparer<TKey> comparer)

Listing 12-12 demonstrates the ThenBy method.

Listing 12-12. Example of the ThenBy Extension Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<Person> persons = new List<Person>()
{
new Person(){ Name="Person F", Address= "Address of F",
Id= 111116},
new Person(){ Name="Person G", Address= "Address of G",
Id= 111117},
new Person(){ Name="Person C", Address= "Address of C",
Id= 111113},
new Person(){ Name="Person B", Address= "Address of B",
Id= 111112},
new Person(){ Name="Person D", Address= "Address of D",
Id= 111114},
new Person(){ Name="Person A", Address= "Address of A",
Id= 111111},
new Person(){ Name="Person E", Address= "Address of E",
Id= 111115}
};

var result =
persons.OrderBy(person => person.Id).ThenBy(person => person.Name);

foreach (Person person in result)
{
Console.WriteLine("{0,-15} {1,-20}{2,-20}",
person.Name,
person.Address,
person.Id);
}
}
}

public class Person
{
public string Name
{
get;
set;
}

public string Address
{
get;
set;
}

public double Id
{
get;
set;
}
}
}

This program produced the output:

Person A Address of A 111111
Person B Address of B 111112
Person C Address of C 111113
Person D Address of D 111114
Person E Address of E 111115
Person F Address of F 111116
Person G Address of G 111117
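Every Id in Listing 12-12 is unique, so the secondary key never has to break a tie there. The following sketch, which uses a hypothetical list of words instead of the Person class, shows a case where the primary key (the word length) ties and ThenBy decides the final order. The class name ThenBySketch and the data are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ThenBySketch
{
    public static List<string> Run()
    {
        string[] words = { "pear", "fig", "apple", "kiwi", "date", "plum" };

        /* Primary key: the length. Secondary key: the text itself,
         * compared with the ordinal comparer overload of ThenBy. */
        return words
            .OrderBy(word => word.Length)
            .ThenBy(word => word, StringComparer.Ordinal)
            .ToList();
    }

    public static void Main()
    {
        /* Prints: fig date kiwi pear plum apple */
        Console.WriteLine(string.Join(" ", Run()));
    }
}
```

The four words of length 4 tie on the primary key, so their relative order comes entirely from the secondary key.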

Internal Operation of the ThenBy Method of Execution

This method works as demonstrated in the steps that follow.

1. Step 1: It will call the CreateOrderedEnumerable method, as shown in Listing 12-13.

Listing 12-13. Implementation of the ThenBy Method

public static IOrderedEnumerable<TSource> ThenBy<TSource, TKey>(
this IOrderedEnumerable<TSource> source,
Func<TSource, TKey> keySelector)
{
return source.CreateOrderedEnumerable<TKey>(keySelector, null, false);
}

2. Step 2: The CreateOrderedEnumerable method instantiates an instance of the OrderedEnumerable class, as shown in Listing 12-14.

Listing 12-14. Implementation of the CreateOrderedEnumerable Method

IOrderedEnumerable<TElement> IOrderedEnumerable<TElement>.CreateOrderedEnumerable<TKey>(
Func<TElement, TKey> keySelector,
IComparer<TKey> comparer,
bool descending)
{
return new OrderedEnumerable<TElement, TKey>(
this.source,
keySelector,
comparer, descending)
{ parent = (OrderedEnumerable<TElement>) this };
}

3. Step 3: The OrderedEnumerable class is implemented, as shown in Listing 12-15.

Listing 12-15. Implementation for the OrderedEnumerable Class

internal class OrderedEnumerable<TElement, TKey> : OrderedEnumerable<TElement>
{
/* Code removed*/
internal OrderedEnumerable(
IEnumerable<TElement> source,
Func<TElement, TKey> keySelector,
IComparer<TKey> comparer, bool descending) { /*Code removed*/ }

internal override EnumerableSorter<TElement> GetEnumerableSorter(
EnumerableSorter<TElement> next)
{
EnumerableSorter<TElement> enumerableSorter = new
EnumerableSorter<TElement, TKey>(
this.keySelector,
this.comparer,
this.descending, next);
if (this.parent != null)
{
enumerableSorter =
this.parent.GetEnumerableSorter(enumerableSorter);
}
return enumerableSorter;
}
}
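The chaining of sorters through the next parameter can be imitated with an ordinary comparer. The sketch below is not the framework's EnumerableSorter; ChainedComparer is a hypothetical class that only illustrates the idea of consulting the next comparison when the current one reports a tie.

```csharp
using System;
using System.Collections.Generic;

/* Hypothetical comparer: applies its own comparison first and consults
 * the next comparer only when the two items tie. */
public sealed class ChainedComparer<T> : IComparer<T>
{
    private readonly Comparison<T> current;
    private readonly IComparer<T> next;

    public ChainedComparer(Comparison<T> current, IComparer<T> next)
    {
        this.current = current;
        this.next = next;
    }

    public int Compare(T x, T y)
    {
        int result = current(x, y);
        if (result != 0 || next == null)
            return result;
        return next.Compare(x, y);
    }
}

public static class ChainedComparerSketch
{
    public static List<string> Run()
    {
        List<string> words = new List<string>()
            { "pear", "fig", "apple", "kiwi", "date", "plum" };

        /* Built innermost first, as GetEnumerableSorter does: the
         * secondary comparison becomes "next" for the primary one. */
        IComparer<string> byText = new ChainedComparer<string>(
            (a, b) => string.CompareOrdinal(a, b), null);
        IComparer<string> byLengthThenText = new ChainedComparer<string>(
            (a, b) => a.Length.CompareTo(b.Length), byText);

        words.Sort(byLengthThenText);
        return words;
    }

    public static void Main()
    {
        /* Prints: fig date kiwi pear plum apple */
        Console.WriteLine(string.Join(" ", Run()));
    }
}
```

The sketch sorts with List<T>.Sort, which is not a stable sort; that is acceptable here only because no two words tie on both keys.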

Reverse

The Reverse method inverts the order of the elements in a sequence. This method follows the deferred execution pattern. Unlike OrderBy, this sorting method does not consider the actual values themselves in determining the order. Rather, it just returns the elements in the reverse order from which they are produced by the underlying source. The signature of the Reverse extension method is:

public static IEnumerable<TSource> Reverse<TSource>(this IEnumerable<TSource> source)

In the example provided in Listing 12-16, the Reverse method reverses the original sequence.

Listing 12-16. Example of the Reverse Method

using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> numbers = new List<int>() { 1, 2, 3, 4, 5 };
var reverseNumbers = numbers.Reverse();

var result = reverseNumbers.ToList();

result.ForEach(x => Console.Write("{0}\t", x));
Console.WriteLine();
}
}
}

The program in Listing 12-16 will produce the output:

5 4 3 2 1

Internal Operation of the Reverse Method of Execution

Let’s analyze the code in Listing 12-16 carefully to really understand what’s happening when the CLR executes the Reverse method.

1. Step 1: The CLR passes the original sequence, in this case numbers, as input to the Reverse method, which instantiates an instance of the ReverseIterator<TSource>. This iterator holds the information needed when the sequence is later enumerated, for example, by the ToList method or a foreach loop.

2. Step 2: The CLR will pass the ReverseIterator<TSource> instance to the ToList method, which passes this iterator to the List class and processes the operation based on the iteration logic implemented in the ReverseIterator<TSource>. Finally, this method produces the reversed sequence as output, as shown in Listing 12-17.

Listing 12-17. Implementation of the ReverseIterator<TSource>

private static IEnumerable<TSource> ReverseIterator<TSource>(
IEnumerable<TSource> source)
{
/* Copy the original sequence into this Buffer instance. Buffer is a struct
* that holds the input sequence in an internal array. */
Buffer<TSource> iteratorVariable0 = new Buffer<TSource>(source);

/* index will hold the last index of the buffer */
int index = iteratorVariable0.count - 1;

while (true)
{
if (index < 0)
yield break;

/* Return the item at the current index of the Buffer's internal array,
* then decrement the index. The loop ends once the index falls
* below zero. */
yield return iteratorVariable0.items[index];
index--;
}
}

The Buffer struct has the definition:

.class private sequential ansi sealed beforefieldinit Buffer<TElement> extends [mscorlib]System.ValueType
{

.method assembly hidebysig specialname rtspecialname instance void
.ctor(class
[mscorlib]System.Collections.Generic.IEnumerable`1<!TElement>
source) cil managed {/* Code removed */ }

.method assembly hidebysig instance !TElement[] ToArray() cil managed
{ /* Code removed */ }
.field assembly int32 count

/* items array will hold the data from the sequence*/
.field assembly !TElement[] items
}
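Because the Buffer is filled only when the iterator starts running, a change made to the source after calling Reverse, but before enumerating, is still visible in the result. The following sketch illustrates this; the class name ReverseSketch and the data are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReverseSketch
{
    public static List<int> Run()
    {
        List<int> numbers = new List<int>() { 1, 2, 3 };

        /* Call the extension method explicitly, because List<T> has its
         * own in-place Reverse method that returns void. */
        IEnumerable<int> reversed = Enumerable.Reverse(numbers);

        /* Added after Reverse was called but before enumeration. */
        numbers.Add(4);

        /* The source is copied into the buffer now. */
        return reversed.ToList();
    }

    public static void Main()
    {
        /* Prints: 4 3 2 1 */
        Console.WriteLine(string.Join(" ", Run()));
    }
}
```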

Grouping- and Joining-Based Methods

This section will examine different grouping- and joining-based methods, such as Join, GroupJoin, and GroupBy.

Join

The Join operator is used to merge two sequences into one based on the joining condition. Figure 12-6 demonstrates the Join operation.

images

Figure 12-6. Join basic operation

The Join operation works through the following steps:

1. Step 1: It constructs a grouping table using the inner sequence and inner key selector.

2. Step 2: It will iterate through the outer sequence and match each item against the grouping table. Each matched item is then passed to the result selector, which processes it, and the result is stored in a list of the <>f__AnonymousType0<string,string> type. The type <>f__AnonymousType0<string,string> is constructed based on the anonymous type passed into the result selector.

The signature of the Join operator is:

public static IEnumerable<TResult> Join<TOuter, TInner, TKey, TResult>(
this IEnumerable<TOuter> outer, IEnumerable<TInner> inner,
Func<TOuter, TKey> outerKeySelector, Func<TInner, TKey> innerKeySelector,
Func<TOuter, TInner, TResult> resultSelector)

public static IEnumerable<TResult> Join<TOuter, TInner, TKey, TResult>(
this IEnumerable<TOuter> outer, IEnumerable<TInner> inner,
Func<TOuter, TKey> outerKeySelector, Func<TInner, TKey> innerKeySelector,
Func<TOuter, TInner, TResult> resultSelector, IEqualityComparer<TKey> comparer
)

An example of the Join operation is presented in Listing 12-18.

Listing 12-18. Implementation of the Join Operation

using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
List<Person> persons;
List<Address> addresses;

InitializeData(out persons, out addresses);

/* persons - Outer Sequence */
var result = persons.Join(
/*addresses - Inner Sequence*/
addresses,
/* Outer Key Selector */
person => person,
/* Inner Key Selector */
address => address.AddressOf,
/* Result Selector */
(person, address) =>
new
{
PersonName = person.PersonName,
AddressDetails = address.AddressDetails
}
);

result.ToList().
ForEach(personAddress =>
Console.WriteLine("{0} \t{1}",
personAddress.PersonName, personAddress.AddressDetails));
}

private static void InitializeData(
out List<Person> persons, out List<Address> addresses)
{
var personA = new Person
{ PersonID = "PA_01", PersonName = "A" };
var personB = new Person
{ PersonID = "PB_01", PersonName = "B" };
var personC = new Person
{ PersonID = "PC_01", PersonName = "C" };

var addressOne = new Address
{ AddressOf = personA,
AddressDetails = "Mystery Street,Jupiter" };
var addressTwo = new Address
{ AddressOf = personA, AddressDetails = "Dark Street,Mars" };
var addressThree = new Address
{ AddressOf = personB, AddressDetails = "Sun Street,Jupiter" };
var addressFour = new Address
{ AddressOf = personC, AddressDetails = "Dry Street,Neptune" };

persons = new List<Person>
{ personA, personB, personC };
addresses = new List<Address>
{ addressOne, addressTwo, addressThree, addressFour };
}
}

public class Person
{
public string PersonID { get; set; }
public string PersonName { get; set; }
}

public class Address
{
public Person AddressOf { get; set; }
public string AddressDetails { get; set; }
}
}

The program in Listing 12-18 produces the output:

A Mystery Street,Jupiter
A Dark Street,Mars
B Sun Street,Jupiter
C Dry Street,Neptune

Internal Operation of the Join Method of Execution

Let’s analyze the code in Listing 12-18 to help us understand in depth what’s happening while the CLR executes Join. Figure 12-7 demonstrates the inner details of the Join method.

images

Figure 12-7. How the Join method works

Each of the steps illustrated in Figure 12-7 will be explored in detail in the following discussion.

1. Step 1: The CLR passes the related data, such as outer sequence, inner sequence, key, and result selectors, into the Join method, which selects the appropriate iterator, as demonstrated in Listing 12-19.

Listing 12-19. Implementation of the Join Method

public static IEnumerable<TResult> Join<TOuter, TInner, TKey, TResult>(
this IEnumerable<TOuter> outer,
IEnumerable<TInner> inner,
Func<TOuter, TKey> outerKeySelector,
Func<TInner, TKey> innerKeySelector,
Func<TOuter, TInner, TResult> resultSelector,
IEqualityComparer<TKey> comparer)
{

/* JoinIterator will be returned which holds all the necessary information
* to execute while enumerated either via ToList() or ForEach method*/
return JoinIterator<TOuter, TInner, TKey, TResult>(
outer,
inner,
outerKeySelector,
innerKeySelector,
resultSelector,
comparer);
}

Listing 12-19 shows that the Join method returns the JoinIterator to the caller that encapsulates related information. The caller of the Join method will not execute the JoinIterator immediately.

2. Step 2: Due to the deferred execution, as soon as the caller of the Join method calls the ToList method, the CLR will start executing the JoinIterator. Listing 12-20 shows the implementation code for the JoinIterator.

Listing 12-20. Implementation of the Join Operation

private static IEnumerable<TResult> JoinIterator<TOuter, TInner, TKey, TResult>(
IEnumerable<TOuter> outer,
IEnumerable<TInner> inner,
Func<TOuter, TKey> outerKeySelector,
Func<TInner, TKey> innerKeySelector,
Func<TOuter, TInner, TResult> resultSelector,
IEqualityComparer<TKey> comparer)
{
Lookup<TKey, TInner> iteratorVariable0 =
Lookup<TKey, TInner>.CreateForJoin(inner, innerKeySelector, comparer);

foreach (TOuter iteratorVariable1 in outer)
{
Lookup<TKey, TInner>.Grouping grouping =
iteratorVariable0.GetGrouping(outerKeySelector(iteratorVariable1),
false);

if (grouping != null)
{
for (int i = 0; i < grouping.count; i++)
yield return resultSelector(iteratorVariable1,
grouping.elements[i]);
}
}
}

Listing 12-20 shows that the CLR will initialize a Lookup class using the CreateForJoin method, which holds the default initialized groupings array of Grouping type. The CLR will iterate through the original inner sequence and, based on the inner keySelector, add the iterated item into the groupings array of the Lookup class. The implementation of the CreateForJoin method is:

internal static Lookup<TKey, TElement> CreateForJoin(
IEnumerable<TElement> source,
Func<TElement, TKey> keySelector,
IEqualityComparer<TKey> comparer)
{
/* It will initialize an array of Grouping into the groupings
* with the default size 7 */
Lookup<TKey, TElement> lookup = new Lookup<TKey, TElement>(comparer);
foreach (TElement local in source)
{
TKey key = keySelector(local);
if (key != null)
{
/* Add the item to the grouping for this key, creating it if needed */
lookup.GetGrouping(key, true).Add(local);
}
}
return lookup;
}

3. Step 3: The CLR will iterate through the outer sequence and, using the outer key selector, it will get relevant items from the grouping table created for the inner sequence.

4. Step 4: The CLR invokes the resultSelector for each matched pair of outer and inner items and yields the result.
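The steps above can be sketched with the public ToLookup method standing in for the internal CreateForJoin method. SimpleJoin is a hypothetical helper written for illustration, not the framework's JoinIterator, and it differs in one respect: ToLookup keeps items whose key is null, whereas CreateForJoin skips them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinSketch
{
    /* Hypothetical helper that follows the same outline as the Join
     * operation: build a grouping table, then probe it per outer item. */
    public static IEnumerable<TResult> SimpleJoin<TOuter, TInner, TKey, TResult>(
        IEnumerable<TOuter> outer,
        IEnumerable<TInner> inner,
        Func<TOuter, TKey> outerKeySelector,
        Func<TInner, TKey> innerKeySelector,
        Func<TOuter, TInner, TResult> resultSelector)
    {
        /* Grouping table over the inner sequence. */
        ILookup<TKey, TInner> lookup = inner.ToLookup(innerKeySelector);

        foreach (TOuter outerItem in outer)
        {
            /* A missing key gives an empty sequence, so an outer item
             * without a match produces no output. */
            foreach (TInner innerItem in lookup[outerKeySelector(outerItem)])
                yield return resultSelector(outerItem, innerItem);
        }
    }

    public static List<string> Run()
    {
        /* Assumed sample data: each pet is written as "owner:pet". */
        string[] owners = { "A", "B", "C" };
        string[] pets = { "A:cat", "A:dog", "C:fish" };

        return SimpleJoin(
            owners,
            pets,
            owner => owner,
            pet => pet.Substring(0, 1),
            (owner, pet) => owner + " owns " + pet.Substring(2)).ToList();
    }

    public static void Main()
    {
        /* Prints three lines: A owns cat, A owns dog, C owns fish */
        Run().ForEach(Console.WriteLine);
    }
}
```

Owner B has no pet in the sample data, so it does not appear in the joined result.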

GroupJoin

GroupJoin works the same as the Join operator, but it produces hierarchical output. The signatures of the GroupJoin method are:

public static IEnumerable<TResult> GroupJoin<TOuter, TInner, TKey, TResult>(
this IEnumerable<TOuter> outer, IEnumerable<TInner> inner,
Func<TOuter, TKey> outerKeySelector, Func<TInner, TKey> innerKeySelector,
Func<TOuter, IEnumerable<TInner>, TResult> resultSelector)

public static IEnumerable<TResult> GroupJoin<TOuter, TInner, TKey, TResult>(
this IEnumerable<TOuter> outer, IEnumerable<TInner> inner,
Func<TOuter, TKey> outerKeySelector, Func<TInner, TKey> innerKeySelector,
Func<TOuter, IEnumerable<TInner>, TResult> resultSelector, IEqualityComparer<TKey> comparer)

A modified version of the Join example is used to demonstrate the use of the GroupJoin operator, as shown in Listing 12-21.

Listing 12-21. Example of the GroupJoin

static void Main(string[] args)
{
List<Person> persons;
List<Address> addresses;
InitializeData(out persons, out addresses);

/* persons - Outer Sequence */
var result = persons.GroupJoin(
/*addresses - Inner Sequence*/
addresses,
/* Outer Key Selector */
person => person,
/* Inner Key Selector */
address => address.AddressOf,
/* Result Selector */
(person, address) =>
new
{
PersonName = person.PersonName,
AddressDetails =
address.Select(innerAddress=>innerAddress.AddressDetails)
}
);

var rr = result.ToList();
foreach (var item in rr)
{
Console.WriteLine("{0}", item.PersonName);
item.AddressDetails.ToList().ForEach(
address => Console.WriteLine(address));
}
}

It will produce the output:

A
Mystery Street,Jupiter
Dark Street,Mars
B
Sun Street,Jupiter
C
Dry Street,Neptune
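One practical difference from Join is that GroupJoin calls the result selector once for every outer item, even when no inner item matches; the group is simply empty in that case. The sketch below contrasts the two operators. The class name GroupJoinSketch and the sample data are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupJoinSketch
{
    /* Assumed sample data: each pet is written as "owner:pet". */
    private static readonly string[] Owners = { "A", "B", "C" };
    private static readonly string[] Pets = { "A:cat", "A:dog", "C:fish" };

    public static List<string> WithGroupJoin()
    {
        /* One result per owner, including the owner without pets. */
        return Owners.GroupJoin(
            Pets,
            owner => owner,
            pet => pet.Substring(0, 1),
            (owner, ownedPets) => owner + ": " + ownedPets.Count()).ToList();
    }

    public static List<string> WithJoin()
    {
        /* One result per matching pair, so owner B is absent. */
        return Owners.Join(
            Pets,
            owner => owner,
            pet => pet.Substring(0, 1),
            (owner, pet) => owner + ": " + pet.Substring(2)).ToList();
    }

    public static void Main()
    {
        /* Prints: A: 2, B: 0, C: 1 */
        Console.WriteLine(string.Join(", ", WithGroupJoin()));
        /* Prints: A: cat, A: dog, C: fish */
        Console.WriteLine(string.Join(", ", WithJoin()));
    }
}
```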

GroupBy

The GroupBy method groups the elements of a sequence based on the specified key selector. This method has eight overloads, and one of those is:

public static IEnumerable<TResult> GroupBy<TSource, TKey, TElement, TResult>(
this IEnumerable<TSource> source, Func<TSource, TKey> keySelector,
Func<TSource, TElement> elementSelector,
Func<TKey, IEnumerable<TElement>, TResult> resultSelector,
IEqualityComparer<TKey> comparer)

An example of the GroupBy method is shown in Listing 12-22.

Listing 12-22. Example of the GroupBy Method

using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
List<Person> persons;
InitializeData(out persons);

var result = persons.GroupBy(
person => person.PersonAge,
person => person.PersonID,
(Age, Id) =>
new
{
PersonAge = Age,
PersonID = Id
}
);

Console.WriteLine("Age group \t No of person \t Persons are");
result.ToList().ForEach(item =>
Console.WriteLine(
string.Format("{0,5} \t {1,15} \t {2,-33}",
item.PersonAge,
item.PersonID.Count(),
string.Join(",", item.PersonID))));
}

private static void InitializeData(
out List<Person> persons)
{
persons = new List<Person>
{
new Person { PersonID = "PA_01", PersonAge = 6 },
new Person { PersonID = "PB_01", PersonAge = 7 },
new Person { PersonID = "PC_01", PersonAge = 7 },
new Person { PersonID = "PD_01", PersonAge = 4 },
new Person { PersonID = "PE_01", PersonAge = 7 },
new Person { PersonID = "PF_01", PersonAge = 5 },
new Person { PersonID = "PG_01", PersonAge = 5 },
new Person { PersonID = "PH_01", PersonAge = 9 },
new Person { PersonID = "PI_01", PersonAge = 9 }
};
}
}

public class Person
{
public string PersonID { get; set; }
public int PersonAge { get; set; }
}
}

This program will produce the output:

Age group No of person Persons are
6 1 PA_01
7 3 PB_01,PC_01,PE_01
4 1 PD_01
5 2 PF_01,PG_01
9 2 PH_01,PI_01

Internal Operation of the GroupBy Method of Execution

Let’s analyze Listing 12-22 to help us understand what’s happening while the CLR executes the GroupBy method. Figure 12-8 demonstrates the details about the GroupBy method.

images

Figure 12-8. Working details of the GroupBy method

Each of the steps in Figure 12-8 will be explored in detail in the following discussion.

1. Step 1: The C# compiler will construct a method <Main>b__1 using the anonymous method person => person.PersonAge, <Main>b__2 using the anonymous method person => person.PersonID, and <Main>b__3 using the anonymous method shown below:

(Age, Id) =>
new
{
PersonAge = Age,
PersonID = Id
}

The CLR will instantiate three instances of the MulticastDelegate object using the <Main>b__1, <Main>b__2, and <Main>b__3 methods to pass this into the GroupedEnumerable class as input to keySelector, elementSelector, and the resultSelector from the GroupBy method:

public static IEnumerable<TResult> GroupBy<TSource, TKey, TElement, TResult>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector,
Func<TSource, TElement> elementSelector,
Func<TKey, IEnumerable<TElement>, TResult> resultSelector)
{
return new GroupedEnumerable<TSource, TKey, TElement, TResult>(
source,
keySelector,
elementSelector,
resultSelector, null);
}

2. Step 2: When the CLR instantiates the GroupedEnumerable class, the constructor stores its inputs, namely source (the original sequence), keySelector, elementSelector, resultSelector, and comparer, in fields for later use.

3. Step 3: Because the GroupBy method follows the deferred execution pattern, nothing is grouped until the result is enumerated. When the program calls the ToList method, it initializes a new list using the enumerator obtained by calling GetEnumerator on the GroupedEnumerable instance instantiated in step 1. The implementation of the GetEnumerator method is:

public IEnumerator<TResult> GetEnumerator()
{
/* source - refers the Original sequence
* keySelector - refers The Key selector
* elementSelector - refers The Element selector
* comparer - refers The comparer
* ApplyResultSelector - apply the result selector to extract the result */
return Lookup<TKey, TElement>.Create<TSource>(
this.source,
this.keySelector,
this.elementSelector,
this.comparer).
/* Apply the result selector to extract the result */
ApplyResultSelector<TResult>(this.resultSelector).GetEnumerator();
}

The GetEnumerator method shows that the Lookup class will hold the grouped result after executing the Create and ApplyResultSelector.

4. Step 4: Internally the Create method of the Lookup class will instantiate an instance of the Lookup class, which will initialize a data structure groupings (an array with default size 7) with the type Grouping, as shown in the partial code of the constructor of the Lookup class:

private Lookup(IEqualityComparer<TKey> comparer)
{
/* Code removed */
this.groupings = new Grouping<TKey, TElement>[7];
}

This groupings array will initially hold default values of the specified type. The CLR then iterates through the original list. Based on the keySelector and elementSelector, it extracts data from each item and adds it to the matching entry in the groupings array:

foreach (TSource local in source)
{
lookup.GetGrouping(keySelector(local), true).Add(elementSelector(local));
}

On return of the Create method, the CLR will return the instance of the Lookup class that holds the Lookup table for the original sequence. The implementation of the code for the Create method in the Lookup class is:

internal static Lookup<TKey, TElement> Create<TSource>( IEnumerable<TSource> source,
Func<TSource, TKey> keySelector,
Func<TSource, TElement> elementSelector,
IEqualityComparer<TKey> comparer)
{
Lookup<TKey, TElement> lookup = new Lookup<TKey, TElement>(comparer);
foreach (TSource local in source)
{
lookup.GetGrouping(keySelector(local),
true).Add(elementSelector(local));
}
return lookup;
}
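The two phases described above, building the lookup table and then applying the result selector to each grouping, can be sketched with the public ToLookup method. SimpleGroupBy is a hypothetical helper, not the framework's GroupedEnumerable class, and the sample data is a shortened, assumed version of the data in Listing 12-22.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupBySketch
{
    /* Hypothetical helper: ToLookup plays the role of Lookup.Create and
     * the loop plays the role of ApplyResultSelector. */
    public static IEnumerable<TResult> SimpleGroupBy<TSource, TKey, TElement, TResult>(
        IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        Func<TSource, TElement> elementSelector,
        Func<TKey, IEnumerable<TElement>, TResult> resultSelector)
    {
        ILookup<TKey, TElement> lookup =
            source.ToLookup(keySelector, elementSelector);

        foreach (IGrouping<TKey, TElement> grouping in lookup)
            yield return resultSelector(grouping.Key, grouping);
    }

    public static List<string> Run()
    {
        /* Item1 is the person ID and Item2 is the age. */
        Tuple<string, int>[] persons =
        {
            Tuple.Create("PA_01", 6),
            Tuple.Create("PB_01", 7),
            Tuple.Create("PC_01", 7),
            Tuple.Create("PD_01", 4)
        };

        return SimpleGroupBy(
            persons,
            person => person.Item2,
            person => person.Item1,
            (age, ids) => age + " -> " + string.Join(",", ids)).ToList();
    }

    public static void Main()
    {
        /* Prints three lines: 6 -> PA_01, 7 -> PB_01,PC_01, 4 -> PD_01 */
        Run().ForEach(Console.WriteLine);
    }
}
```

As in the output of Listing 12-22, the groups appear in the order in which each key is first encountered in the source.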

Set-Based Methods

This section will explore the different set-based methods, such as Distinct, Except, Union, and Intersect.

Distinct

The Distinct method returns distinct elements from a sequence. The signature of this extension method is:

public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource>
source)
public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource>
source, IEqualityComparer<TSource> comparer)

The first overload returns distinct elements from a sequence by using the default equality comparer to compare values, and the second overload returns distinct elements by using a specified IEqualityComparer<T> to compare values.

The Distinct extension method returns the unique items from the list. For example, the program in Listing 12-23 determines the distinct items from the sequence {1,1,1,2,2,2,3,3,3}.

Listing 12-23. Example of the Distinct Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> numbers = new List<int>()
{
1,1,1,2,2,2,3,3,3
};

var distinctedNumbers = numbers.Distinct().ToList();
distinctedNumbers.ForEach(x => Console.Write(string.Format("{0}\t", x)));
}
}
}

This program will produce the output:

1 2 3

Internal Operation of the Distinct Method of Execution

Let’s analyze Listing 12-23 to help us understand what’s happening while the CLR executes the Distinct method. Figure 12-9 demonstrates the details of the Distinct method in Linq.

images

Figure 12-9. Distinct method working details

Each step from Figure 12-9 will be explored in further detail in the following discussion.

1. Step 1: The CLR will pass the original list to the Distinct method as input, which will later instantiate an instance of DistinctIterator by encapsulating the original list.

2. Step 2: From the ToList method, the CLR calls the List class by passing the DistinctIterator instance, instantiated in step 1, as input to it. While the List class iterates through the DistinctIterator, the iterator instantiates an instance of the Set<TSource>, which is used for temporary sequence storage, as demonstrated in Listing 12-24.

Listing 12-24. Implementation Code of the Distinct Method

private static IEnumerable<TSource> DistinctIterator<TSource>(
IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)
{
Set<TSource> iteratorVariable0 = new Set<TSource>(comparer);
foreach (TSource iteratorVariable1 in source)
{
if (iteratorVariable0.Add(iteratorVariable1))
{
/* Only execute this line when able to add item in the Set */
yield return iteratorVariable1;
}
}
}

While the CLR iterates through the original list, it will add the iterated item to the Set<TSource> instance it created earlier. Internally the Set<TSource> class uses the Add and Find methods to add the item from the given sequence into the internal slots array only when there is no duplicate item in the slots array.

The slots is an array of the Slot type, which is defined in the System.Linq namespace of the System.Core.dll assembly. The following code was extracted via the ildasm.exe:

.class sequential ansi sealed nested assembly beforefieldinit
Slot<TElement>
extends [mscorlib]System.ValueType
{
.field assembly int32 hashCode
.field assembly int32 next
/* This field will hold the value */
.field assembly !TElement value
}

When there is a duplicate item in the Set, the Add method does not add that item in the slots and it will continue until the CLR reaches the end of the list iteration and produces a list with distinct items.
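The same pattern can be sketched with the public HashSet<T> class, whose Add method also returns false for a duplicate. SimpleDistinct is a hypothetical helper in which HashSet<T> stands in for the internal Set<TSource>; it is an illustration, not the framework's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctSketch
{
    /* Hypothetical helper: yields an item only the first time it is
     * successfully added to the set. A null comparer means that the
     * default equality comparer is used. */
    public static IEnumerable<T> SimpleDistinct<T>(
        IEnumerable<T> source, IEqualityComparer<T> comparer)
    {
        HashSet<T> seen = new HashSet<T>(comparer);
        foreach (T item in source)
        {
            if (seen.Add(item))
                yield return item;
        }
    }

    public static List<int> Numbers()
    {
        int[] numbers = { 1, 1, 1, 2, 2, 2, 3, 3, 3 };
        return SimpleDistinct(numbers, null).ToList();
    }

    public static List<string> Letters()
    {
        /* With a case-insensitive comparer, "A" duplicates "a". */
        string[] letters = { "a", "A", "b" };
        return SimpleDistinct(letters, StringComparer.OrdinalIgnoreCase).ToList();
    }

    public static void Main()
    {
        /* Prints: 1 2 3 */
        Console.WriteLine(string.Join(" ", Numbers()));
        /* Prints: a b */
        Console.WriteLine(string.Join(" ", Letters()));
    }
}
```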

Except

The Except extension method is used to remove a list of items from another list of items. It produces the set difference of two sequences. For example, if you have a list of items {1,2,3,4,5,6,7} and another one with {1,2,3}, the Except of these two lists will produce {4,5,6,7}.

The signature of this extension method is:

public static IEnumerable<TSource> Except<TSource>(
this IEnumerable<TSource> first, IEnumerable<TSource> second)
public static IEnumerable<TSource> Except<TSource>(
this IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer)

The first version produces the set difference of two sequences by using the default equality comparer to compare values, and the second version produces the set difference of two sequences by using the specified IEqualityComparer<T> to compare values.

The program in Listing 12-25 demonstrates the use of the Except method.

Listing 12-25. Example of the Except Method

using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> firstNumbers = new List<int>()
{
1,2,3,4,5,6,7
};
IList<int> secondNumbers = new List<int>()
{
1,2,3
};

var result = firstNumbers.Except(secondNumbers).ToList();
result.ForEach(x => Console.Write(string.Format("{0}\t",x)));
}
}
}

The program in Listing 12-25 will produce the output:

4 5 6 7

Internal Operation of the Except Method of Execution

Let’s analyze Listing 12-25 to help us understand what’s happening when the Except method is executed. Figure 12-10 demonstrates the inner details of the Except method.

images

Figure 12-10. Except method working details

Each of the steps from Figure 12-10 will be explored in detail in the following discussion.

1. Step 1: The CLR passes the two lists to the Except method, which internally instantiates the ExceptIterator<TSource> by encapsulating the lists inside the iterator. Due to the deferred execution, this iterator will not execute until the ToList method is called or the iterator object is enumerated in a foreach loop.

2. Step 2: When the CLR executes the ToList method, it calls the List class by passing the ExceptIterator instance created in step 1 as input. The List class calls the ExceptIterator method while it iterates through the list.

The ExceptIterator method creates a new instance of the Set<TSource> and adds all the items from the second list, as demonstrated in Listing 12-26.

Listing 12-26. Implementation Code of the ExceptIterator

private static IEnumerable<TSource> ExceptIterator<TSource>(
IEnumerable<TSource> first,
IEnumerable<TSource> second,
IEqualityComparer<TSource> comparer)
{
Set<TSource> iteratorVariable0 = new Set<TSource>(comparer);
foreach (TSource local in second)
iteratorVariable0.Add(local);
foreach (TSource iteratorVariable1 in first)
{
if (!iteratorVariable0.Add(iteratorVariable1))
continue;
yield return iteratorVariable1;
}
}

In the second loop, the CLR iterates through the first list and tries to add each iterated item to the set. If the item is not already in the set, that is, if iteratorVariable0.Add(iteratorVariable1) returns true, the CLR yields that item; otherwise it skips the item. This continues until it finishes the first list.
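A consequence of adding the items of the first list to the same set is that Except returns each remaining value only once, even if the first list contains it several times. The sketch below compares Except with a plain Where filter; the class name ExceptSketch and the data are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptSketch
{
    private static readonly int[] First = { 1, 1, 2, 4, 4, 5 };
    private static readonly int[] Second = { 1, 2, 3 };

    public static List<int> WithExcept()
    {
        /* Set difference: the repeated 4 is returned only once. */
        return First.Except(Second).ToList();
    }

    public static List<int> WithWhere()
    {
        /* A plain filter keeps every occurrence of 4. */
        return First.Where(x => !Second.Contains(x)).ToList();
    }

    public static void Main()
    {
        /* Prints: 4 5 */
        Console.WriteLine(string.Join(" ", WithExcept()));
        /* Prints: 4 4 5 */
        Console.WriteLine(string.Join(" ", WithWhere()));
    }
}
```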

Union

The Union method produces the set union (denoted as ∪) of two sequences. The Union method excludes duplicates from the output sequence. For example, if you have two sets, A = {1,2,3,4,5,6,7} and B = {5,6,7,8,9}, the union of these sets is A ∪ B = {1,2,3,4,5,6,7,8,9}, as demonstrated in Figure 12-11.

images

Figure 12-11. Union operation

The signature of the Union method is:

public static IEnumerable<TSource> Union<TSource>(
this IEnumerable<TSource> first, IEnumerable<TSource> second)
public static IEnumerable<TSource> Union<TSource>(
this IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer)

The program in Listing 12-27 demonstrates the usage of the Union operation.

Listing 12-27. Example of the Union Operation Using the Union Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> firstList = new List<int>()
{
1,2,3,4
};

IList<int> secondList = new List<int>()
{
7,9,3,4,5,6,7
};

var result = firstList.Union(secondList);
result.ToList().ForEach(x =>
Console.Write(string.Format("{0}\t",x)));
}
}
}

The program in Listing 12-27 will produce the output:

1 2 3 4 7 9 5 6

Internal Operation of the Union Method of Execution

To execute the Union method, the CLR will follow these steps:

1. Step 1: The Union method returns the UnionIterator<TSource>, which holds the firstList, the secondList, and null for the IEqualityComparer, because the program in Listing 12-27 does not supply a comparer to compare the items.

2. Step 2: Due to the deferred execution, the UnionIterator<TSource> will be executed when the CLR starts executing the ToList method. Inside the UnionIterator<TSource>, a new instance of the Set<TSource> class is instantiated, which will be used to find the distinct items from both lists, as demonstrated in Listing 12-28.

Listing 12-28. Implementation Code for the UnionIterator

private static IEnumerable<TSource> UnionIterator<TSource>(
IEnumerable<TSource> first,
IEnumerable<TSource> second,
IEqualityComparer<TSource> comparer)
{
Set<TSource> iteratorVariable0 = new Set<TSource>(comparer);

foreach (TSource iteratorVariable1 in first)
{
/* If the CLR is able to add the iterated item from the first list
* to the Set, it yields that item and continues the loop. This
* makes sure that only the distinct items from the first list
* are returned. */
if (iteratorVariable0.Add(iteratorVariable1))
yield return iteratorVariable1;
}
foreach (TSource iteratorVariable2 in second)
{
/* If the CLR is not able to add the iterated item from the second
* list to the Set, the item is a duplicate of an item already
* returned, so the CLR skips it and continues the loop.
* Otherwise the item is returned. This continues until the
* second list is finished. */
if (!iteratorVariable0.Add(iteratorVariable2))
continue;
yield return iteratorVariable2;
}
}
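Because UnionIterator walks the first list and then the second list while filtering through a single set, its output should match that of Concat followed by Distinct. The sketch below checks this with the data from Listing 12-27; the class name UnionSketch is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionSketch
{
    private static readonly int[] FirstList = { 1, 2, 3, 4 };
    private static readonly int[] SecondList = { 7, 9, 3, 4, 5, 6, 7 };

    public static List<int> WithUnion()
    {
        return FirstList.Union(SecondList).ToList();
    }

    public static List<int> WithConcatDistinct()
    {
        /* Append the second list, then drop the repeated values. */
        return FirstList.Concat(SecondList).Distinct().ToList();
    }

    public static void Main()
    {
        /* Both lines print: 1 2 3 4 7 9 5 6 */
        Console.WriteLine(string.Join(" ", WithUnion()));
        Console.WriteLine(string.Join(" ", WithConcatDistinct()));
    }
}
```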

Intersect

The Intersect extension method produces the set intersection of two sequences. For example, if you have a list A with items {1,2,3,4,5} and B with {4,5}, the intersection of these two lists A and B is {4,5}, as shown in Figure 12-12.

images

Figure 12-12. Intersect operation

The method signature for this extension method is:

public static IEnumerable<TSource> Intersect<TSource>(
this IEnumerable<TSource> first, IEnumerable<TSource> second)
public static IEnumerable<TSource> Intersect<TSource>(
this IEnumerable<TSource> first, IEnumerable<TSource> second,IEqualityComparer<TSource> comparer)

The first overload produces the set intersection of two sequences by using the default equality comparer to compare values, and the second overload produces it by using the specified IEqualityComparer<T> to compare values.

The program in Listing 12-29 creates two lists: listA with 1,2,3,4,5 and listB with 4,5. It shows the intersect operation between these two lists.

Listing 12-29. Example of the Intersect Method

using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> listA = new List<int>() { 1, 2, 3, 4, 5 };
IList<int> listB = new List<int>() { 4, 5 };

var intersectResult = listA.Intersect(listB);

intersectResult.ToList().ForEach(x => Console.Write("{0}\t",x));
}
}
}

This program produced the output:

4 5

Internal Operation of the Intersect Method of Execution

While executing the Intersect method, the CLR follows these steps:

1. Step 1: The CLR initializes the IntersectIterator<TSource> with the original lists and returns to the caller the instance of the IntersectIterator<TSource>. Due to the deferred execution, this iterator will not execute until the ToList method is called.

2. Step 2: While executing the IntersectIterator<TSource>, the CLR instantiates an instance of the Set<TSource> that is used to hold the second list, demonstrated in the implementation of the IntersectIterator as shown in Listing 12-30.

Listing 12-30. Implementation Code of the IntersectIterator Method

private static IEnumerable<TSource> IntersectIterator<TSource>(
IEnumerable<TSource> first,
IEnumerable<TSource> second,
IEqualityComparer<TSource> comparer)
{
Set<TSource> iteratorVariable0 = new Set<TSource>(comparer);

/* The CLR will add all the items from the second list to the Set.*/
foreach (TSource local in second)
iteratorVariable0.Add(local);

/* Iterate through the first list */
foreach (TSource iteratorVariable1 in first)
{
/* Remove succeeds only if the Set still holds an equal item, that is,
* the item also occurs in the second list. In that case the item is
* returned; otherwise the loop continues with the next item. */
if (!iteratorVariable0.Remove(iteratorVariable1))
continue;

yield return iteratorVariable1;
}
}

The CLR then iterates through the first list and tries to remove each of its items from the Set<TSource> that holds the items of the second list, as mentioned earlier. If the removal succeeds, the CLR returns that item of the first list; otherwise it moves on to the next item, until it reaches the end of the first list. Because a returned item has already been removed from the Set<TSource>, a value that occurs more than once in the first list is returned only once.
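The remove-and-yield idea can be tried out with a small sketch. Set<TSource> is internal to System.Core, so the code below assumes HashSet<T> as a stand-in; IntersectLike is a hypothetical name, not the framework method:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    static class IntersectSketch
    {
        /* Sketch only: HashSet<T> stands in for the internal Set<TSource>. */
        public static IEnumerable<T> IntersectLike<T>(
            IEnumerable<T> first, IEnumerable<T> second)
        {
            var remaining = new HashSet<T>(second);
            foreach (T item in first)
            {
                /* Remove succeeds only the first time a shared item is seen,
                 * so each shared value is yielded at most once. */
                if (remaining.Remove(item))
                    yield return item;
            }
        }

        static void Main(string[] args)
        {
            var listA = new List<int> { 5, 4, 4, 1, 5 };
            var listB = new List<int> { 4, 5, 5 };

            Console.WriteLine(string.Join(",", IntersectLike(listA, listB)));
            Console.WriteLine(string.Join(",", listA.Intersect(listB)));
        }
    }
}
```

Both lines print 5,4: the result follows the order of the first list and contains no duplicates.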

Aggregation-Based Methods

This section will explore in detail the different aggregation-based methods, such as Sum, LongCount, Max, Min, Count, Average, and Aggregate.

Sum

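To sum the items of a list, you can use the Sum method. The Sum method has 20 overloads, all of them static extension methods: 10 operate directly on a sequence of numeric values (int, long, float, double, and decimal, each with a nullable counterpart), and the other 10 take a selector that projects each item to one of those numeric types. Two of the signatures are: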

public static int Sum( this IEnumerable<int> source)
public static int Sum<TSource>(this IEnumerable<TSource> source, Func<TSource, int> selector)

An example of the Sum extension method is presented in Listing 12-31.

Listing 12-31. Example of the Sum Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> numbers = new List<int>()
{
1,2,3,4,5,6,7,8,9,10
};

Console.WriteLine("Sum of the numbers :{0}", numbers.Sum());

Console.WriteLine("Sum of the original numbers x 2 :{0}",
numbers.Sum(x => x * 2));
}
}
}

This program produces the output:

Sum of the numbers :55
Sum of the original numbers x 2 :110

Internal Operation of the Sum Method of Execution

To execute the first overloaded Sum extension method used in Listing 12-31, the CLR follows these steps:

1. Step 1: The CLR passes the original list as input to the Sum extension method.

2. Step 2: The Sum method loops through the list, performs the summation of each of the items, and produces the result, as demonstrated in this implementation code:

public static int Sum(this IEnumerable<int> source)
{
int num = 0;
foreach (int num2 in source)
num += num2;
return num;
}
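The simplified listing above leaves out one detail that is easy to observe from the outside: according to the documented behavior of Enumerable.Sum, the int overload throws an OverflowException when the total exceeds Int32.MaxValue. The following sketch, in which SumOverflows is a hypothetical helper, shows this and one possible workaround:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    class Program
    {
        /* Hypothetical helper: reports whether summing as int overflows. */
        public static bool SumOverflows(IEnumerable<int> source)
        {
            try
            {
                source.Sum();
                return false;
            }
            catch (OverflowException)
            {
                return true;
            }
        }

        static void Main(string[] args)
        {
            var big = new List<int> { int.MaxValue, 1 };

            Console.WriteLine(SumOverflows(big));

            /* Projecting each item to long selects the long overload of Sum. */
            Console.WriteLine(big.Sum(x => (long)x));
        }
    }
}
```

The program prints True followed by 2147483648.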

Let’s analyze Listing 12-31 to help us understand what’s happening while the CLR executes the second overloaded Sum extension method. Figure 12-13 demonstrates the details of the Sum method.

Figure 12-13. How the Sum method works

Each of the steps in Figure 12-13 will be explored in detail in the following discussion.

1. Step 1: The C# compiler constructs the method <Main>b__1 for the anonymous method block (x => x * 2), as shown in the following IL code:

.method private hidebysig static int32 <Main>b__1(int32 x) cil managed
{
/* Code removed */
L_0000: ldarg.0
L_0001: ldc.i4.2
/* Multiply the argument at position 0 of the evaluation stack by 2*/
L_0002: mul
L_0003: stloc.0
L_0004: br.s L_0006
L_0006: ldloc.0
L_0007: ret
}

The CLR creates a MulticastDelegate instance using the method <Main>b__1, as shown in the following IL code extracted from the decompiled Main method of the program in Listing 12-31 using the .NET Reflector tool:

L_007f: ldftn int32 Ch12.Program::<Main>b__1(int32)
L_0085: newobj instance void
[mscorlib]System.Func`2<int32, int32>::.ctor(object, native int)

Note: The System.Func<TSource,TResult> is derived from the MulticastDelegate.

2. Step 2: The CLR passes the original list and the MulticastDelegate instance created in step 1 as input to the Sum method, which calls the Select method with the same two arguments. The Select method instantiates the relevant iterator, for example, WhereSelectListIterator<TSource, TResult>, and returns it to the Sum method. The CLR then calls the overloaded Sum method that accepts only a sequence. Because the selector in Listing 12-31 returns int, this is the Sum(this IEnumerable<int> source) overload shown earlier. The overloads for the other numeric types follow the same pattern; the float overload, for example, would be:

public static float Sum(
this IEnumerable<float> source)
{
double num = 0.0;
foreach (float num2 in source)
num += num2;
return (float) num;
}

3. Step 3: The CLR gets the iterator from the IEnumerable object passed to the Sum method, for example, the instance of the WhereSelectListIterator, as shown in the following decompiled Sum method code:

.method public hidebysig static int32 Sum(
class [mscorlib]System.Collections.Generic.IEnumerable`1<int32> source)
cil managed
{
/* Code removed */
/* Load argument 0, the IEnumerable<int32> passed by the caller, onto
* the evaluation stack */
L_0010: ldarg.0

L_0011: callvirt instance
/* return type of the GetEnumerator method */
class [mscorlib]System.Collections.Generic.IEnumerator`1<!0>
[mscorlib]System.Collections.Generic.IEnumerable`1<int32>::
GetEnumerator()
/* Code removed */
}

4. Step 4: Using this Enumerator, the CLR iterates through the items from the original list. While the CLR executes the iterator, it iterates through each of the items by calling the MoveNext method from the relevant enumerator and executes the relevant selector method passed as input to it. The following anonymous method is used to modify the item while it is being retrieved from the original list and just before committing to the sum for that item:

( x => x * 2 ) /* compiled as <Main>b__1 used in the MoveNext method
* as Selector*/

The implementation of the MoveNext method of WhereSelectListIterator<TSource, TResult> class would be:

public override bool MoveNext()
{
while (this.enumerator.MoveNext())
{
TSource current = this.enumerator.Current;
if ((this.predicate == null) || this.predicate(current))
{
/* selector(current) will actually execute as
* <Main>b__1(current) */
base.current = this.selector(current);
return true;
}
}
return false;
}

From there you find that the this.selector(current) statement executes the selector for the current item from the original list, and the iteration continues until it reaches the end of the original list.
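The relationship between Sum, Select, and the selector can be checked with a short sketch. SumLike is a hypothetical re-creation written for this illustration, and the counter exists only to show how often the selector runs:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    class Program
    {
        /* Hypothetical re-creation of Sum(source, selector) for int values. */
        public static int SumLike<TSource>(
            IEnumerable<TSource> source, Func<TSource, int> selector)
        {
            /* Select is deferred; the loop inside Sum drives MoveNext,
             * and MoveNext calls the selector once per item. */
            return source.Select(selector).Sum();
        }

        static void Main(string[] args)
        {
            var numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
            int selectorCalls = 0;
            Func<int, int> timesTwo = x => { selectorCalls++; return x * 2; };

            Console.WriteLine(SumLike(numbers, timesTwo));
            Console.WriteLine(numbers.Sum(timesTwo));
            Console.WriteLine(selectorCalls);
        }
    }
}
```

Both sums are 110, and the selector is called 20 times in total, once per item for each of the two passes.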

LongCount

The LongCount method can be used to return an Int64 that represents the number of elements in a sequence. The method signature for this extension method is:

public static long LongCount<TSource>(this IEnumerable<TSource> source)
public static long LongCount<TSource>(this IEnumerable<TSource> source,
Func<TSource, bool> predicate)

The first overload returns an Int64 that represents the total number of elements in a sequence; the second returns an Int64 that represents how many elements in a sequence satisfy a condition.

Listing 12-32 presents an example of the LongCount method.

Listing 12-32. Example of the LongCount Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> firstList = new List<int>()
{
1,2,3,4
};

Console.WriteLine(firstList.LongCount());
}
}
}

This program will produce the output:

4

Internal Operation of the LongCount Method of Execution

The implementation of the first version of the LongCount method shows that the CLR iterates through the original list and adds one to the num variable for each item, so num holds the number of items when the iteration ends:

public static long LongCount<TSource>(this IEnumerable<TSource> source) {
long num = 0L;
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
while (enumerator.MoveNext())
{
num += 1L;
}
}
return num;
}

The implementation of the second version of the LongCount extension method shows that the CLR iterates through the original list, evaluates the provided predicate for each item, and adds one to the num variable for each item that meets the condition:

public static long LongCount<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate) {
long num = 0L;
foreach (TSource local in source)
{
if (predicate(local))
num += 1L;
}
return num;
}
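Listing 12-32 exercises only the first overload. A minimal usage sketch of the predicate overload follows; the CountEven helper and the even-number condition are chosen purely for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    class Program
    {
        /* Hypothetical helper: counts the even items as an Int64. */
        public static long CountEven(IEnumerable<int> source)
        {
            return source.LongCount(x => x % 2 == 0);
        }

        static void Main(string[] args)
        {
            IList<int> firstList = new List<int>() { 1, 2, 3, 4 };

            long evens = CountEven(firstList);
            Console.WriteLine(evens);
            Console.WriteLine(evens.GetType());
        }
    }
}
```

The program prints 2 and System.Int64.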

Max

The Max extension method is used to determine the maximum value in a list. The Max method has 22 overloads, two of which are:

public static int Max(this IEnumerable<int> source)
public static decimal Max<TSource>(this IEnumerable<TSource> source,
Func<TSource, decimal> selector)

Listing 12-33 presents an example of the Max method.

Listing 12-33. Example of Max Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> numbers = new List<int>()
{
1,2,3,4,5,6,7,8,9,10
};

Console.WriteLine("Max of the numbers :{0}", numbers.Max());
Console.WriteLine("Max of the original numbers x2 :{0}",
numbers.Max(x => x * 2));
}
}
}

This program produces the output:

Max of the numbers :10
Max of the original numbers x2 :20

Internal Operation of the Max Method of Execution

The CLR performs the following steps to execute the Max operation:

1. Step 1: The CLR passes the original list as input to the Max method.

2. Step 2: The Max method loops through the list and performs the Max operation, as demonstrated here by the implementation of the Max:

public static int Max(
this IEnumerable<int> source)
{
int num = 0;
bool flag = false;
foreach (int num2 in source)
{
if (flag)
{
if (num2 > num)
num = num2;
}
else
{
num = num2;
flag = true;
}
}
return num;
}

The second overloaded method of the Max extension method works as demonstrated in Figure 12-14.

Figure 12-14. How the Max method works

The CLR follows these steps in executing the Max method:

1. Step 1: The C# compiler constructs a method <Main>b__1 from the anonymous method (x => x * 2). The CLR uses this <Main>b__1 method to instantiate an instance of the MulticastDelegate class and calls the Select method, which takes the original list and the instance of the MulticastDelegate as input and returns the relevant iterator instance, for example, WhereSelectListIterator<TSource,TResult>, for the list as output.

2. Step 2: The CLR then calls the overloaded Max method, which accepts only the iterator returned from step 1. In this overloaded Max method, a foreach loop forces the iterator to run: each item of the list passes through the selector, and the Max operation is performed on the projected values.
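One case the simplified Max listing does not cover is an empty sequence. The documented behavior of Enumerable.Max for a sequence of non-nullable int values is to throw an InvalidOperationException. The sketch below shows that behavior and one way to supply a fallback; MaxOrDefault is a hypothetical helper, not part of the framework:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    class Program
    {
        /* Hypothetical helper: Max with a fallback for empty input. */
        public static int MaxOrDefault(IEnumerable<int> source, int fallback)
        {
            /* DefaultIfEmpty yields the fallback only when source is empty. */
            return source.DefaultIfEmpty(fallback).Max();
        }

        static void Main(string[] args)
        {
            var empty = new List<int>();

            try
            {
                empty.Max();
            }
            catch (InvalidOperationException)
            {
                Console.WriteLine("Empty sequence");
            }

            Console.WriteLine(MaxOrDefault(empty, -1));
            Console.WriteLine(MaxOrDefault(new List<int> { 3, 9, 2 }, -1));
        }
    }
}
```

The program prints Empty sequence, then -1, then 9.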

Min

The Min extension method will determine the minimum of the list. Two signatures of the extension methods Min (of the 22 overloaded methods) are:

public static int Min(this IEnumerable<int> source)
public static int Min<TSource>(this IEnumerable<TSource> source,
Func<TSource, int> selector)

The program in Listing 12-34 demonstrates the usage of the Min extension method.

Listing 12-34. Example of Min Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> numbers = new List<int>()
{
1,2,3,4,5,6,7,8,9,10
};

Console.WriteLine("Min of the numbers :{0}", numbers.Min());

Console.WriteLine("Min of the original numbers x2 :{0}",
numbers.Min(x => x * 2));
}
}
}

This program will produce the output:

Min of the numbers :1
Min of the original numbers x2 :2

Internal Operation of the Min Method of Execution

When the CLR executes the first version of the Min extension method, as used in Listing 12-34, it passes the original list as input to the Min extension method. The Min method then loops through the list and keeps the smallest value it has seen.
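The loop described above can be sketched by mirroring the Max listing with the comparison reversed. MinLike is a hypothetical method written for this illustration, not the decompiled framework code, and the exception for an empty sequence is modeled on the documented behavior of Enumerable.Min:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    static class MinSketch
    {
        /* Sketch modeled on the Max listing, with the comparison reversed. */
        public static int MinLike(IEnumerable<int> source)
        {
            int min = 0;
            bool hasValue = false;
            foreach (int item in source)
            {
                if (!hasValue)
                {
                    /* The first item becomes the initial minimum. */
                    min = item;
                    hasValue = true;
                }
                else if (item < min)
                {
                    min = item;
                }
            }
            if (!hasValue)
                throw new InvalidOperationException("Sequence contains no elements");
            return min;
        }

        static void Main(string[] args)
        {
            var numbers = new List<int> { 4, 2, 8, 1, 9 };
            Console.WriteLine(MinLike(numbers));
            Console.WriteLine(numbers.Min());
        }
    }
}
```

Both calls print 1.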

The second version of the Min extension method is demonstrated in Figure 12-15.

Figure 12-15. Min method working details

The CLR follows these steps:

1. Step 1: The C# compiler constructs a method <Main>b__1 from the anonymous method (x => x * 2), and the CLR uses this <Main>b__1 method to instantiate an instance of the MulticastDelegate class. It then calls the Select method by passing the original list and the instance of the MulticastDelegate as input, which returns the relevant iterator instance, for example, WhereSelectListIterator<TSource,TResult>, for the list as output.

2. Step 2: The CLR calls the overloaded Min method, which accepts only the iterator returned in step 1. In this overloaded Min method, a foreach loop forces the iterator to run: each item of the list passes through the provided selector, and the Min operation is performed on the projected values.

Count

The Count method returns the number of elements in a sequence. The method signature for this method is:

public static int Count<TSource>(this IEnumerable<TSource> source)
public static int Count<TSource>(this IEnumerable<TSource> source,
Func<TSource, bool> predicate)

The first version returns the number of elements in a sequence. The second version returns a number that represents how many elements are in the specified sequence that satisfy a condition.

The Count method determines how many items are in the list. For example, the program in Listing 12-35 determines how many items are in listOne and also determines how many items in listOne have more than three characters.

Listing 12-35. Example of the Count Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<string> listOne = new List<string>()
{
"One","Two","Three"
};

var result = listOne.Count();

var fourOrMoreCharacters = listOne.Count(item => item.Length > 3);
Console.WriteLine("{0}\n{1}", result,fourOrMoreCharacters);
}
}
}

The program in Listing 12-35 will produce the output:

3
1

Internal Operation of the Count Method of Execution

Let’s analyze the code in Listing 12-35 to help us understand what’s happening when we use the Count method.

When the CLR executes the first overload of the Count method, it gets the enumerator of the given list and iterates through the items until the MoveNext method of the enumerator returns false.

The implementation of the Count method shown in Listing 12-36 returns the number of iterations as the item count of the list.

Listing 12-36. Implementation of the Count Method

public static int Count<TSource>(this IEnumerable<TSource> source)
{
int num = 0;
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
while (enumerator.MoveNext())
{
/* num will be increased as per the iteration of the while*/
num++;
}
}
return num;
}
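Listing 12-36 shows only the enumeration path. In the .NET Framework version of Enumerable.Count, a sequence that implements ICollection<T> is answered from its Count property without enumerating. The following sketch combines both paths; CountLike is a hypothetical name, and the code is an approximation rather than the decompiled source:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    static class CountSketch
    {
        public static int CountLike<T>(IEnumerable<T> source)
        {
            /* Collections already know their size, so no loop is needed. */
            var collection = source as ICollection<T>;
            if (collection != null)
                return collection.Count;

            /* Otherwise count one MoveNext call at a time. */
            int num = 0;
            using (IEnumerator<T> enumerator = source.GetEnumerator())
            {
                while (enumerator.MoveNext())
                    num++;
            }
            return num;
        }

        static void Main(string[] args)
        {
            var listOne = new List<string> { "One", "Two", "Three" };

            /* List<string> implements ICollection<string>. */
            Console.WriteLine(CountLike(listOne));

            /* The iterator returned by Where has to be enumerated. */
            Console.WriteLine(CountLike(listOne.Where(s => s.Length > 3)));
        }
    }
}
```

The program prints 3 and 1.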

Figure 12-16 demonstrates the second version of the Count method.

Figure 12-16. Count method working details

The second version of the Count method takes the original list and a predicate, and counts only the items that satisfy the predicate. The method behind the predicate is generated at compile time from the anonymous method (item => item.Length > 3). The CLR loops through the items of the list and executes the predicate on each of them. Whenever the predicate returns true for an item, the CLR increases the item count.

Finally, it returns the item count as the total number of items that meet the condition. The implementation of the Count method is:

public static int Count<TSource>( this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
{
/* num holds the number of items that satisfy the predicate */
int num = 0;
foreach (TSource local in source)
if (predicate(local))
num++;
return num;
}

Average

The Average method calculates the average of a sequence of numeric values. Two (of a total of 20 overloaded methods) of the signatures for this extension method are:

public static double Average(this IEnumerable<int> source)
public static decimal Average<TSource>(
this IEnumerable<TSource> source, Func<TSource, decimal> selector)

Listing 12-37 presents an example of the Average method.

Listing 12-37. Example of Average Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> numbers = new List<int>()
{
1,2,3,4,5,6,7,8,9,10
};
Console.WriteLine("Average of the numbers :{0}",
numbers.Average());

Console.WriteLine("Average of the original numbers x2 :{0}",
numbers.Average((x => x * 2)));
}
}
}

This program produces the output:

Average of the numbers :5.5
Average of the original numbers x2 :11

Internal Operation of the Average Method of Execution

Listing 12-37 shows that the CLR passes the list numbers as input to the Average method. This method iterates through the original list, calculates the average of its items, and returns that average as output. The CLR executes the first version of the Average method using these steps:

1. Step 1: The CLR passes the original list, for example, numbers for Listing 12-37, as input to the Average method.

2. Step 2: The Average method loops through the list and performs the average. The implementation of the Average method is shown in Listing 12-38.

Listing 12-38. Internal Operation of the Average Method

public static double Average(this IEnumerable<int> source)
{
long num = 0L;
long num2 = 0L;
foreach (int num3 in source)
{
num += num3;
num2 += 1L;
}

/* total of the numbers / to no of items in list */
return (((double) num) / ((double) num2));
}

The second version of the Average extension method follows the steps outlined in Figure 12-17 to perform the operation.

Figure 12-17. Average extension method working details

Figure 12-17 shows that the CLR passes the list numbers to the Average method along with an instance of the MulticastDelegate class, which is instantiated from the method the C# compiler generated at compile time for the anonymous method (x=>x*2). The CLR passes this instance of the MulticastDelegate to the Select method, from which the appropriate iterator is instantiated. For the program in Listing 12-37, the WhereSelectListIterator<int,int> iterator is instantiated, and it holds the original list and the selector. The CLR then iterates through the list and calculates the average of the values that the selector produces.

Let’s analyze Listing 12-37 to understand what’s happening when the CLR executes the second version of the Average method.

1. Step 1: The CLR passes the original list and the instance of the MulticastDelegate created from the <Main>b__1 method (which the C# compiler generates at compile time from the anonymous method (x=>x*2)) as input to the Average method.

2. Step 2: While executing the Average method, the CLR calls the Select method, which instantiates and returns the WhereSelectListIterator<int, int> iterator.

3. Step 3: The CLR then calls the overloaded Average method by passing the WhereSelectListIterator<int, int> instantiated in step 2 as input. This iterator contains the original list and the instance of the MulticastDelegate as the selector.

4. Step 4: In the Average method, the foreach block iterates through the list and accumulates the total and the item count used for the average calculation, as demonstrated in this fragment of the implementation code:

foreach (int num3 in source)
{
num += num3;
num2 += 1L;
}
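The steps above can be sketched as a composition of Select and Average. AverageLike is a hypothetical helper written for this illustration; the last two lines of Main contrast Average with a hand-written integer division, which silently drops the fraction:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    class Program
    {
        /* Hypothetical re-creation of Average(source, selector) for int. */
        public static double AverageLike(
            IEnumerable<int> source, Func<int, int> selector)
        {
            return source.Select(selector).Average();
        }

        static void Main(string[] args)
        {
            var numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

            /* Same value from the sketch and from the framework overload. */
            Console.WriteLine(AverageLike(numbers, x => x * 2));
            Console.WriteLine(numbers.Average(x => x * 2));

            /* int / int truncates, whereas Average divides as double. */
            Console.WriteLine(numbers.Sum() / numbers.Count);
            Console.WriteLine(numbers.Average());
        }
    }
}
```

The first two lines print 11, the integer division prints 5, and Average returns 5.5 (the decimal separator in the output depends on the current culture).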

Aggregate

The Aggregate method combines the items of a sequence into a single value. For example, if you have a sequence with One, Two, Three, Four and apply the Aggregate method over the sequence, the CLR performs the operation as demonstrated in Figure 12-18.

Figure 12-18. Basics of the Aggregate operation

The signature of this method is:

public static TSource Aggregate<TSource>(
this IEnumerable<TSource> source, Func<TSource, TSource, TSource> func)
public static TAccumulate Aggregate<TSource, TAccumulate>(
this IEnumerable<TSource> source,
TAccumulate seed, Func<TAccumulate, TSource, TAccumulate> func)
public static TResult Aggregate<TSource, TAccumulate, TResult>(
this IEnumerable<TSource> source, TAccumulate seed,
Func<TAccumulate, TSource, TAccumulate> func, Func<TAccumulate, TResult> resultSelector)

Listing 12-39 demonstrates an example of the Aggregate method.

Listing 12-39. Example of the Aggregate Method

using System;
using System.Collections.Generic;
using System.Linq;
namespace Ch12
{
class Program
{
static void Main(string[] args)
{

List<string> numbers = new List<string>()
{
"One", "Two", "Three", "Four"
};

var result = numbers.Aggregate(
(aggregatedValue, nextItem) =>
nextItem + aggregatedValue);

Console.WriteLine("Aggregated value : {0}", result);
}
}
}

This program produces the output:

Aggregated value : FourThreeTwoOne

The items appear in the reverse order of the original list because the anonymous method concatenates nextItem + aggregatedValue. If it concatenated aggregatedValue + nextItem instead, the output would read as:

Aggregated value : OneTwoThreeFour
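The effect of the operand order can be tried side by side. The Concatenate helper and its reverse flag are hypothetical names introduced for this sketch:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    class Program
    {
        /* Hypothetical helper: joins the items in list order or reversed. */
        public static string Concatenate(IEnumerable<string> items, bool reverse)
        {
            return reverse
                /* The next item goes in front of what has been built so far. */
                ? items.Aggregate((aggregatedValue, nextItem) =>
                      nextItem + aggregatedValue)
                /* The next item is appended to what has been built so far. */
                : items.Aggregate((aggregatedValue, nextItem) =>
                      aggregatedValue + nextItem);
        }

        static void Main(string[] args)
        {
            var numbers = new List<string> { "One", "Two", "Three", "Four" };

            Console.WriteLine(Concatenate(numbers, true));
            Console.WriteLine(Concatenate(numbers, false));
        }
    }
}
```

The program prints FourThreeTwoOne followed by OneTwoThreeFour.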

Internal Operation of the Aggregate Method of Execution

Figure 12-19 demonstrates the internal operation of the Aggregate method.

Figure 12-19. How the Aggregate method works

Let’s analyze the code in Listing 12-39 to understand what’s happening as we initiate the Aggregate method.

1. Step 1: The C# compiler creates the <Main>b__1 method to encapsulate the body of the anonymous method ((aggregatedValue, nextItem) => nextItem + aggregatedValue), which will be used as the Aggregator func of the Aggregate method. The CLR uses this method to execute Aggregate. The contents of the compiled <Main>b__1 method would be:

.method private hidebysig static string <Main>b__1
(string aggregatedValue, string nextItem) cil managed
{
/* Code removed */
L_0000: ldarg.1
L_0001: ldarg.0
L_0002: call string [mscorlib]System.String::Concat(string, string)
L_0007: stloc.0
L_0008: br.s L_000a
L_000a: ldloc.0
L_000b: ret
}

2. Step 2: When the CLR executes the Aggregate function, it takes the original list and anonymous method created in step 1 as input. It iterates through each of the items from the list and uses the first item from the list as the current aggregate value:

TSource current = enumerator.Current;

The second item (read through enumerator.Current after the enumerator moves forward) is passed together with the current aggregate value to the anonymous method, that is, the aggregator, to generate a new aggregate value. This new value is used as the aggregate value for the next iteration, and the process continues until the CLR reaches the end of the list. The implementation of the Aggregate method used in Listing 12-39 is shown in Listing 12-40.

Listing 12-40. Implementation Code for the Aggregate Method

public static TSource Aggregate<TSource>(
this IEnumerable<TSource> source,
Func<TSource, TSource, TSource> func)
{
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
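/* Move to the first item; the framework throws an
* InvalidOperationException when the sequence is empty */
if (!enumerator.MoveNext())
throw new InvalidOperationException("Sequence contains no elements");
/* current as the initial seed */
TSource current = enumerator.Current;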
/* enumerator will move forward and start looping from the
* second item */
while (enumerator.MoveNext())
/* seed and iterated item will be passed to the Aggregator which
* is func to generate new seed. */
current = func(current, enumerator.Current);

return current;
}
}

The following discussion explores the overloaded versions of the Aggregate method that take a value as input for the initial aggregate value. With these overloads, the CLR does not take the first item from the list as the initial aggregate value as it did for the first version of the Aggregate method discussed earlier. Listing 12-41 demonstrates the use of an Aggregate overload that takes an initial aggregate value (and, in addition, a result selector) as input.

Listing 12-41. Modified Version of Listing 12-39

using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
List<string> items = new List<string>()
{
"One", "Two", "Three", "Four"
};
var result = items.Aggregate(
/* Zero as seed, to use as the initial aggregate value */
"Zero",
(temporaryAggregatedValue, nextItem) =>
{
Console.WriteLine(temporaryAggregatedValue);
return nextItem + temporaryAggregatedValue;
},
aggregatedResult =>
string.Format("Final result : {0}",
aggregatedResult.ToUpper())
);
Console.WriteLine(result);
}
}
}

This program will produce the output:

Zero
OneZero
TwoOneZero
ThreeTwoOneZero
Final result : FOURTHREETWOONEZERO

The Aggregate method produces each new aggregate value by calling the Aggregator method with the current aggregate value (initially the seed) and the iterated item. The final result can then be produced in a specific format: for the code in Listing 12-41, the anonymous method compiled as <Main>b__2 formats the aggregated value to produce the final result, as demonstrated in Figure 12-20.

Figure 12-20. How the Aggregate method works

The CLR passes the initial aggregate value and the first item from the original list (for the first pass) to <Main>b__1 to create the aggregate value, as shown in Listing 12-41. After performing the initial aggregate operation, <Main>b__1 returns the new aggregate value. After the first pass, the CLR replaces the old aggregate value with the new one created using <Main>b__1 and continues until it finishes iterating the original list. The implementation of the Aggregate overload that takes a seed is shown in Listing 12-42; the overload with a result selector used in Listing 12-41 runs the same loop and then passes the final aggregate value to the result selector.

Listing 12-42. Implementation Code for the Aggregate Method

public static TAccumulate Aggregate<TSource, TAccumulate>(
this IEnumerable<TSource> source, TAccumulate seed, Func<TAccumulate, TSource, TAccumulate> func)
{
TAccumulate local = seed;
foreach (TSource local2 in source)
local = func(local, local2);
return local;
}
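Listing 12-41 calls the overload that also takes a result selector. A minimal sketch of that overload, assuming it differs from Listing 12-42 only in the final call, could look as follows; AggregateLike is a hypothetical name, and the code is not the decompiled framework source:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    static class AggregateSketch
    {
        public static TResult AggregateLike<TSource, TAccumulate, TResult>(
            IEnumerable<TSource> source, TAccumulate seed,
            Func<TAccumulate, TSource, TAccumulate> func,
            Func<TAccumulate, TResult> resultSelector)
        {
            /* Same loop as the seeded overload. */
            TAccumulate local = seed;
            foreach (TSource item in source)
                local = func(local, item);

            /* The final aggregate value is transformed exactly once. */
            return resultSelector(local);
        }

        static void Main(string[] args)
        {
            var items = new List<string> { "One", "Two", "Three", "Four" };

            string result = AggregateLike(items, "Zero",
                (acc, next) => next + acc, acc => acc.ToUpper());
            Console.WriteLine(result);

            /* Compare with the framework overload. */
            Console.WriteLine(result == items.Aggregate("Zero",
                (acc, next) => next + acc, acc => acc.ToUpper()));
        }
    }
}
```

The program prints FOURTHREETWOONEZERO followed by True.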

The aggregator passed to the Aggregate method in Listing 12-41, which generates each new aggregate value starting from the initial seed, is:

(temporaryAggregatedValue, nextItem) =>
{
Console.WriteLine(temporaryAggregatedValue);
return nextItem + temporaryAggregatedValue;
}

This code is compiled into <Main>b__1:

.method private hidebysig static string <Main>b__1(
string temporaryAggregatedValue, string nextItem) cil managed
{
/* Code removed */
L_000a: call string [mscorlib]System.String::Concat(string, string)
L_000f: stloc.0
L_0013: ret
}

The result selector method aggregatedResult => string.Format("Final result : {0}", aggregatedResult.ToUpper()) is compiled as:

.method private hidebysig static string <Main>b__2(string aggregatedResult) cil managed
{
/* Code removed */
L_000b: call string [mscorlib]System.String::Format(string, object)
L_0014: ret
}

Quantifier-Based Methods

This section will explore in detail the All, Any, and Contains extension methods.

All

The All extension method determines whether all of the elements of a sequence satisfy a condition. It returns true if every element of the source sequence passes the condition in the specified predicate, or if the sequence is empty; otherwise, it returns false. The signature of this extension method is:

public static bool All<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)

Listing 12-43 shows how to check whether all the items in the sequence have at least three characters.

Listing 12-43. Example of the All Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<string> numbers = new List<string>()
{
"One", "Two", "Three", "Four", "Five", "Six", "Seven"
};

if (numbers.All<string>(x => x.Length >= 3))
Console.WriteLine(
"All numbers have at least three characters.");
}
}
}

The program in Listing 12-43 will produce the output:

All numbers have at least three characters.

Internal Operation of the All Method of Execution

The All method tests the items of the sequence against the specified condition. Figure 12-21 demonstrates the details of the All method in Linq.

Figure 12-21. Working details of the All method

From Figure 12-21 you can see that the CLR passes the numbers list as input to the All method along with the instance of the MulticastDelegate class instantiated using the anonymous method (x=>x.Length >= 3). In the All method, the CLR processes the list to find whether each of the items satisfies the condition provided as a predicate.

Let’s analyze the code in Listing 12-43 to understand what’s happening while the CLR uses the All method over a list.

1. Step 1: The CLR uses the method <Main>b__1 (x => x.Length >= 3) to instantiate the instance of the MulticastDelegate. The CLR passes the original list and the instance of the MulticastDelegate class created in this step as input to the extension method All.

2. Step 2: The All method loops through the list and returns false as soon as it finds an element in the sequence that does not meet the condition; if no such element exists, it returns true as the result of the operation. The implementation of the All method is:

public static bool All<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
{
foreach (TSource local in source)
if (!predicate(local))
return false;
return true;
}
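Two consequences of this loop are worth trying out: an empty sequence yields true because the loop body never runs, and the iteration stops at the first failing item. In the sketch below, CountPredicateCalls is a hypothetical helper that counts how often All invokes the predicate:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    class Program
    {
        /* Hypothetical helper: how many times does All call the predicate? */
        public static int CountPredicateCalls(
            IEnumerable<string> source, int minLength)
        {
            int calls = 0;
            source.All(x => { calls++; return x.Length >= minLength; });
            return calls;
        }

        static void Main(string[] args)
        {
            var numbers = new List<string> { "One", "Two", "Three", "Four" };

            /* No items, so nothing can fail the test. */
            Console.WriteLine(new List<string>().All(x => x.Length >= 3));

            /* "One" is shorter than four characters, so All stops there. */
            Console.WriteLine(CountPredicateCalls(numbers, 4));

            /* Every item passes, so every item is tested. */
            Console.WriteLine(CountPredicateCalls(numbers, 3));
        }
    }
}
```

The program prints True, 1, and 4.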

Any

The Any method determines whether any element of a sequence exists or satisfies a condition provided as a predicate. The method signatures for this extension method are:

public static bool Any<TSource>(this IEnumerable<TSource> source)
public static bool Any<TSource>(this IEnumerable<TSource> source,
Func<TSource, bool> predicate)

The first version of the Any extension method will determine whether or not the sequence of items contains any element in it. The second version of the Any extension method will determine if there is any element in the sequence that matches the criteria provided in the predicate.

Listing 12-44 demonstrates the use of the Any method.

Listing 12-44. Example of the Any Extension Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<string> numbers = new List<string>()
{
"One", "Two", "Three", "Four", "Five", "Six", "Seven"
};

if (numbers.Any<string>())
Console.WriteLine("Contains");

if (numbers.Any<string>(x => x.Length >= 3))
Console.WriteLine("Contains");
}
}
}

This program produced the output:

Contains
Contains

Internal Operation of the Any Method of Execution

When the CLR executes the first version of the Any method, it performs the following steps:

1. Step 1: The CLR will send the original sequence or the list, in this case numbers, to the Any<TSource>(this IEnumerable<TSource> source) method as input.

2. Step 2: This method gets the Enumerator object from the list of numbers and calls its MoveNext method once. If MoveNext returns true, the sequence has at least one element and the method returns true; otherwise it returns false (i.e., the sequence does not have any element in it). The implementation of the Any method is shown in Listing 12-45.

Listing 12-45. Implementation of the Any Extension Method

public static bool Any<TSource>(this IEnumerable<TSource> source)
{
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
if (enumerator.MoveNext())
return true;
}
return false;
}

The overloaded version of the Any method will execute as demonstrated in Figure 12-22.

Figure 12-22. Example of the Any statement

The CLR follows these steps in executing the Any method:

1. Step 1: The CLR instantiates an instance of MulticastDelegate using the method <Main>b__1 (constructed at compile time from the anonymous method (x => x.Length >= 3)) and passes it as the predicate, along with the original list, to the Any extension method.

2. Step 2: The CLR loops through the list and executes the predicate with each item as input. As soon as the predicate returns true for an item, the Any method returns true; otherwise the CLR continues with the next item, and the method returns false if no item matches. The implementation of the Any method is shown in Listing 12-46.

Listing 12-46. Implementation of the Any Method

public static bool Any<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
{
foreach (TSource local in source)
{
if (predicate(local))
return true;
}
return false;
}
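
The early return in Listing 12-46 means the predicate is not called for items that follow the first match. The following is a minimal sketch of that behavior; the AnySketch class and its CountPredicateCalls helper are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    public static class AnySketch
    {
        /* Hypothetical helper: returns how many times the predicate ran
         * before the Any method produced its answer. */
        public static int CountPredicateCalls(
            IEnumerable<string> source, Func<string, bool> predicate)
        {
            int calls = 0;
            source.Any(x => { calls++; return predicate(x); });
            return calls;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            IList<string> numbers = new List<string>()
            {
                "One", "Two", "Three", "Four", "Five", "Six", "Seven"
            };

            /* "Three" is the third item and the first one longer than
             * three characters, so the predicate runs three times. */
            Console.WriteLine(
                AnySketch.CountPredicateCalls(numbers, x => x.Length > 3));

            /* No item is longer than ten characters, so all seven
             * items are tested. */
            Console.WriteLine(
                AnySketch.CountPredicateCalls(numbers, x => x.Length > 10));
        }
    }
}
```

The first call prints 3 and the second prints 7, which matches the loop in Listing 12-46: it stops at the first match and otherwise visits every item.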

Contains

The Contains method determines whether or not a sequence contains a specified element. For example, if a sequence contains 1, 2, 3, 4 and you call Contains(4), it returns true because 4 is in the sequence. The first overload compares the elements by using the default equality comparer, and the second overload compares them by using a specified IEqualityComparer<T>.

The method signature for this extension method is:

public static bool Contains<TSource>(this IEnumerable<TSource> source, TSource value)
public static bool Contains<TSource>(this IEnumerable<TSource> source, TSource value,
IEqualityComparer<TSource> comparer)

Listing 12-47 demonstrates the use of the Contains method.

Listing 12-47. Example of Contains Method Over the List<int> Type

using System;
using System.Collections;
using System.Collections.Generic;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> listOne = new List<int>()
{
1,2,3,4,5
};

var resultAsTrue = listOne.Contains(2);
var resultAsFalse = listOne.Contains(200);
Console.WriteLine("{0}\n{1}", resultAsTrue, resultAsFalse);
}
}
}

This program will produce the output:

True
False

Internal Operation of the Contains Method of Execution

While executing the Contains method, the CLR searches the list for a particular item. Because listOne in Listing 12-47 is declared as IList<int>, the call resolves to the Contains method that the List<T> class implements for the ICollection<T> interface, which is the implementation examined here. The search follows one of two paths. If the input is null, the method loops through the list and returns true if one of the items is null; otherwise, it returns false. For any other input, the method compares the value with each of the items in the list by using the default equality comparer and returns true on the first match, or false if no item matches. The implementation of the Contains method is shown in Listing 12-48.

Listing 12-48. Implementation of the Contains Method

public bool Contains(T item)
{
/* First way of search */
if (item == null)
{
for (int j = 0; j < this._size; j++)
{
if (this._items[j] == null)
return true;
}
return false;
}

/* Second way of search when the First does not execute */
EqualityComparer<T> comparer = EqualityComparer<T>.Default;
for (int i = 0; i < this._size; i++)
{
if (comparer.Equals(this._items[i], item))
return true;
}
return false;
}
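
The overload that accepts an IEqualityComparer<TSource> is not exercised in Listing 12-47. As a small sketch of how it could be used, the following program passes StringComparer.OrdinalIgnoreCase (a comparer that ships with the .NET Framework) so that the search ignores case; the list contents are chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    class Program
    {
        static void Main(string[] args)
        {
            IList<string> numbers = new List<string>() { "One", "Two", "Three" };

            /* Default equality comparer: string comparison is case sensitive. */
            bool defaultResult = numbers.Contains("two");

            /* Enumerable.Contains overload with a specified comparer. */
            bool ignoreCaseResult =
                numbers.Contains("two", StringComparer.OrdinalIgnoreCase);

            Console.WriteLine("{0}\n{1}", defaultResult, ignoreCaseResult);
        }
    }
}
```

This prints False followed by True, because only the second call treats "two" and "Two" as equal.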

Element-Based Methods

This section will examine the different element-based extension methods, such as First, FirstOrDefault, Last, LastOrDefault, Single, SingleOrDefault, ElementAt, ElementAtOrDefault, and DefaultIfEmpty.

First

The First method returns the first element of a sequence. The method signatures of this extension are:

public static TSource First<TSource>(this IEnumerable<TSource> source)
public static TSource First<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)

The first version of this extension method finds the first item from the sequence of items. The second version of this extension method finds the first item of a list that meets the predicate condition.

Listing 12-49 demonstrates the use of the First method.

Listing 12-49. Example of the First Extension Methods

using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> numbers = new List<int>()
{
1,2,3,4,5,6,7
};

var firstItem = numbers.First();
var firstItemBasedOnConditions = numbers.First(item => item > 3);
Console.WriteLine("{0}\n{1}",
firstItem,
firstItemBasedOnConditions);
}
}
}

This program produces the output:

1
4

Internal Operation of the First Method of Execution

When the CLR executes the first version of the First method, it follows these steps to execute the operation:

1. Step 1: The CLR sends the original list to the First<TSource>(this IEnumerable<TSource> source) method as an input parameter.

2. Step 2: If the original list implements IList<TSource>, this method returns the item at index zero; otherwise, it iterates through the list via its enumerator and returns the first item from the iteration. If the list has no items, the method throws an InvalidOperationException.

The implementation of the First method is shown in Listing 12-50.

Listing 12-50. Implementation of the First Method

public static TSource First<TSource>(this IEnumerable<TSource> source)
{
IList<TSource> list = source as IList<TSource>;
if (list != null)
{
if (list.Count > 0)
return list[0];
}
else
{
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
if (enumerator.MoveNext())
return enumerator.Current;
}
}
/* The sequence is empty, so there is no first item to return */
throw Error.NoElements();
}

To execute the overloaded version of the First method, which takes a predicate to filter the list, the CLR will instantiate an instance of the MulticastDelegate using the method <Main>b__1, which the C# compiler constructs from the anonymous method (item => item > 3). It will then loop through the list and test each element in the sequence against the predicate. The method returns the first element that matches; if no element matches, it throws an InvalidOperationException.

The implementation of the First method would be:

public static TSource First<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
{
foreach (TSource local in source)
{
if (predicate(local))
return local;
}
/* No item satisfied the predicate */
throw Error.NoMatch();
}
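
The First method has no value to return when the sequence is empty or when no item satisfies the predicate, so in both cases it throws an InvalidOperationException. The following sketch shows one way to observe this; the FirstSketch class and its FirstThrows helper are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    public static class FirstSketch
    {
        /* Hypothetical helper: reports whether First(predicate) threw
         * an InvalidOperationException for the given input. */
        public static bool FirstThrows(
            IEnumerable<int> source, Func<int, bool> predicate)
        {
            try
            {
                source.First(predicate);
                return false;
            }
            catch (InvalidOperationException)
            {
                return true;
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            IList<int> numbers = new List<int>() { 1, 2, 3, 4, 5, 6, 7 };
            IList<int> empty = new List<int>();

            /* A match exists (4), so nothing is thrown. */
            Console.WriteLine(FirstSketch.FirstThrows(numbers, item => item > 3));

            /* No item is greater than 100. */
            Console.WriteLine(FirstSketch.FirstThrows(numbers, item => item > 100));

            /* An empty list can never produce a match. */
            Console.WriteLine(FirstSketch.FirstThrows(empty, item => true));
        }
    }
}
```

The program prints False, True, and True. When a missing item is an expected case rather than an error, the FirstOrDefault method is usually the better fit.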

FirstOrDefault

The FirstOrDefault method returns the first element of a sequence or a default value if no element is found. The signature for the FirstOrDefault method is:

public static TSource FirstOrDefault<TSource>(this IEnumerable<TSource> source)
public static TSource FirstOrDefault<TSource>(
this IEnumerable<TSource> source, Func<TSource, bool> predicate)

The first version returns the first element of a sequence or a default value if the sequence contains no elements. The default value is the default value of the generic type with which the list is instantiated. For example, if the list is of type int, the default value will be zero if there is no element in it. The second version returns the first element of the sequence that satisfies a condition or a default value if no such element is found.

Listing 12-51 presents an example to demonstrate the use of the FirstOrDefault method.

Listing 12-51. Example of FirstOrDefault Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> firstNumbers = new List<int>();

IList<int> secondNumbers = new List<int>()
{
1,2,3,4,5,6,7
};

var firstItemOfFirstList = firstNumbers.FirstOrDefault();
var firstItemIfFirstListBasedOnConditions =
firstNumbers.FirstOrDefault(item => item > 3);

var firstItemOfSecondList = secondNumbers.FirstOrDefault();
var firstItemOfSecondListBasedOnConditions =
secondNumbers.FirstOrDefault(item => item > 3);
Console.Write(string.Format("{0}\t{1}\t{2}\t{3}",
firstItemOfFirstList,
firstItemIfFirstListBasedOnConditions,
firstItemOfSecondList,
firstItemOfSecondListBasedOnConditions
));
}
}
}

The program in Listing 12-51 produced the output:

0 0 1 4

Internal Operation of the FirstOrDefault Method of Execution

The implementation of the first overloaded FirstOrDefault method is shown in Listing 12-52.

Listing 12-52. Implementation of the First FirstOrDefault Method

public static TSource FirstOrDefault<TSource>(this IEnumerable<TSource> source)
{

IList<TSource> list = source as IList<TSource>;
if (list != null)
{
if (list.Count > 0)
return list[0];
}
else
{
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
if (enumerator.MoveNext())
return enumerator.Current;
}
}
return default(TSource);
}

The implementation of the second overloaded FirstOrDefault method is shown in Listing 12-53.

Listing 12-53. Implementation of the Second FirstOrDefault Method

public static TSource FirstOrDefault<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
{
foreach (TSource local in source)
{
if (predicate(local))
return local;
}
return default(TSource);
}
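
Because both versions fall back to default(TSource), a result of zero from a list of int does not tell you whether a real item was found. The following sketch illustrates the ambiguity and one possible way around it, which is to project the items to int? before calling FirstOrDefault; the list contents are an assumption made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    class Program
    {
        static void Main(string[] args)
        {
            IList<int> numbers = new List<int>() { 0, 5, 9 };

            /* Zero here is a real item from the list. */
            int found = numbers.FirstOrDefault(item => item < 1);

            /* Zero here is default(int), because nothing matched. */
            int missing = numbers.FirstOrDefault(item => item > 100);

            /* With int? the default value is null, so a miss is visible. */
            int? missingNullable = numbers
                .Select(item => (int?)item)
                .FirstOrDefault(item => item > 100);

            Console.WriteLine("{0}\t{1}\t{2}",
                found, missing, missingNullable.HasValue);
        }
    }
}
```

The first two values printed are both 0 even though only one of them came from the list, while the third value is False because the nullable query returned null.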

Last

The Last method is used to find the last element of a sequence. The method signature for this extension method is:

public static TSource Last<TSource>(this IEnumerable<TSource> source)
public static TSource Last<TSource>(this IEnumerable<TSource> source, Func<TSource, bool>
predicate)

The first version of this method will find the last item from the sequence of items. The second version of this method will find the last item from a list that meets the predicate condition.

The program in Listing 12-54 demonstrates the use of the Last extension method.

Listing 12-54. Example of the Last Extension Methods

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> numbers = new List<int>()
{
1,2,3,4,5,6,7
};

var lastItem = numbers.Last();
Console.WriteLine(lastItem);
var lastItemBasedOnConditions = numbers.Last(item => item > 3);
Console.WriteLine(lastItemBasedOnConditions);
}
}
}

The code produced the output:

7
7

Internal Operation of the Last Method of Execution

When the CLR executes the first overloaded Last method, the original list will be passed to the Last<TSource>(this IEnumerable<TSource> source) method as input. If the list implements IList<TSource>, the method returns the item at index Count - 1 directly. Otherwise, it iterates through the list via the Enumerator object it gets from the original list, holding the current item from the Enumerator locally as long as the MoveNext method returns true. When MoveNext returns false, the iteration ends and the value held locally is returned as the last item of the list. If the list has no items, the method throws an exception. The implementation of this method is shown in Listing 12-55.

Listing 12-55. Implementation of the Last Method

public static TSource Last<TSource>(this IEnumerable<TSource> source)
{
IList<TSource> list = source as IList<TSource>;
if (list != null)
{
int count = list.Count;
if (count > 0)
return list[count - 1];
}
else
{
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
if (enumerator.MoveNext())
{
TSource current;
do
{
/* Hold the Current item from the original list */
current = enumerator.Current;
}while (enumerator.MoveNext());
return current;
}
}
}
throw Error.NoElements();
}

To execute the overloaded version of the Last method, as used in Listing 12-54, the C# compiler will construct a method <Main>b__1 using the anonymous method (item => item > 3) at compile time, and the CLR will instantiate an instance of the MulticastDelegate using the <Main>b__1. The CLR passes this instance to the Last method. The method will loop through the whole list and test each element in the sequence against the condition provided in the predicate, remembering the most recent match. It returns the last element that satisfies the condition, or throws an exception if no element satisfies it, as shown in the implementation of the Last method in Listing 12-56.

Listing 12-56. Implementation of the Last Method

public static TSource Last<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
{
TSource local = default(TSource);
bool flag = false;
foreach (TSource local2 in source)
{
if (predicate(local2))
{
/* Remember the most recent match */
local = local2;
flag = true;
}
}
if (flag)
return local;
throw Error.NoMatch();
}

LastOrDefault

The LastOrDefault method returns the last element of a sequence or a default value if no element is found. The method signature for this extension method is:

public static TSource LastOrDefault<TSource>(this IEnumerable<TSource> source)
public static TSource LastOrDefault<TSource>(
this IEnumerable<TSource> source, Func<TSource, bool> predicate)

Listing 12-57 demonstrates the use of the LastOrDefault method.

Listing 12-57. Example of the LastOrDefault Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> firstNumbers = new List<int>();

IList<int> secondNumbers = new List<int>()
{
1,2,3,4,5,6,7
};

var lastItemOfFirstList = firstNumbers.LastOrDefault();
var lastItemIfFirstListBasedOnConditions =
firstNumbers.LastOrDefault(item => item > 3);

var lastItemOfSecondList = secondNumbers.LastOrDefault();
var lastItemOfSecondListBasedOnConditions =
secondNumbers.LastOrDefault(item => item > 3);

Console.Write(string.Format("{0}\t{1}\t{2}\t{3}",
lastItemOfFirstList,
lastItemIfFirstListBasedOnConditions,
lastItemOfSecondList,
lastItemOfSecondListBasedOnConditions
));
}
}
}

This program will produce the output:

0 0 7 7

Internal Operation of the LastOrDefault Method of Execution

The implementation code for the first version of the LastOrDefault method is:

public static TSource LastOrDefault<TSource>(this IEnumerable<TSource> source)
{
IList<TSource> list = source as IList<TSource>;
if (list != null)
{
int count = list.Count;
if (count > 0)
return list[count - 1];
}
else
{
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
if (enumerator.MoveNext())
{
TSource current;
do
{
current = enumerator.Current;
}
while (enumerator.MoveNext());
return current;
}
}
}
return default(TSource);
}

The implementation of the second version of the LastOrDefault extension method is:

public static TSource LastOrDefault<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
{
TSource local = default(TSource);
foreach (TSource local2 in source)
{
if (predicate(local2))
local = local2;
}
return local;
}
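
The practical difference between Last and LastOrDefault shows up when nothing matches the predicate. The following short sketch runs the same predicate through both methods; the list contents and the message text are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    class Program
    {
        static void Main(string[] args)
        {
            IList<int> numbers = new List<int>() { 1, 2, 3, 4, 5, 6, 7 };

            /* No item is greater than 100, so the default value is returned. */
            Console.WriteLine(numbers.LastOrDefault(item => item > 100));

            try
            {
                /* Same predicate with Last: there is no match to return. */
                Console.WriteLine(numbers.Last(item => item > 100));
            }
            catch (InvalidOperationException)
            {
                Console.WriteLine("Last found no matching item");
            }
        }
    }
}
```

The program prints 0 for the LastOrDefault call and then the message from the catch block, because the Last call throws an InvalidOperationException.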

Single and SingleOrDefault

The Single method returns a single, specific element of a sequence. The method signatures for the Single and SingleOrDefault extension methods are:

public static TSource Single<TSource>(this IEnumerable<TSource> source)
public static TSource Single<TSource>(this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
public static TSource SingleOrDefault<TSource>(this IEnumerable<TSource> source)
public static TSource SingleOrDefault<TSource>(
this IEnumerable<TSource> source, Func<TSource, bool> predicate)

The first version of the Single method returns the only element of a sequence and throws an exception if there is not exactly one element in the sequence. The second version returns the only element of a sequence that satisfies a specified condition and throws an exception if no such element or more than one such element exists.

Listing 12-58 presents a program that uses the Single extension method. In this program, the Single method works over the numbers list, which contains only one item, and returns One as the output.

Listing 12-58. Use of the Single Extension Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<string> numbers = new List<string>
{
"One"
};
var result = numbers.Single();
Console.WriteLine("{0}", result);
}
}
}

The program in Listing 12-58 will produce the output:

One

Internal Operation of the Single and SingleOrDefault Methods of Execution

If you modify the code in Listing 12-58 and add one more item in the numbers list, the CLR will then throw an exception:

Unhandled Exception: System.InvalidOperationException: Sequence contains more than one element at System.Linq.Enumerable.Single[TSource](IEnumerable'1 source)
at Ch12.Program.Main(String[] args) in J:\Book\ExpertC#2012\SourceCode\BookExamples\Ch12\Program.cs:line 16

Let’s find out how this works behind the scenes. While executing the Single method, the CLR will cast the original list to IList<string> (no copy is made) and check whether or not the result is null. If it is not null, it will check the number of items in the list. If the number of items in the list is zero, the CLR will throw an exception; if it is one, it will return the first and only item from the list; and if it is more than one, it will throw the exception shown above. The implementation of the Single method is shown in Listing 12-59.

Listing 12-59. Implementation of the Single Method

public static TSource Single<TSource>(this IEnumerable<TSource> source)
{
IList<TSource> list = source as IList<TSource>;
if (list != null)
{
switch (list.Count)
{
case 0:
throw Error.NoElements();

case 1:
return list[0];
}
}
else
{
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
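/* An empty sequence has no single item to return */
if (!enumerator.MoveNext())
throw Error.NoElements();

TSource current = enumerator.Current;

if (!enumerator.MoveNext())
return current;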
}
}
throw Error.MoreThanOneElement();
}

Let’s try the Single extension method with a predicate function, as demonstrated in Listing 12-60.

Listing 12-60. Usage of the Single Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<string> numbers = new List<string>
{
"One","Four"
};
var result = numbers.Single(x => x.Length > 3);
Console.WriteLine("{0}", result);
}
}
}

This program produces the output:

Four

Let’s analyze Listing 12-60 to understand what’s happening while we are executing the Single method, as demonstrated in Figure 12-23.

Figure 12-23. How the Single method works

If you modify the program in Listing 12-60 by adding one more item whose length is more than three characters, the program will fail by throwing an exception.

SingleOrDefault works much the same way as the Single extension method, but instead of throwing an exception when there is no item, it returns the default value of the generic type used in the method call. It still throws an exception if there is more than one item. Let’s look at the example in Listing 12-61 of the use of the SingleOrDefault method.

Listing 12-61. Example of SingleOrDefault Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<string> listStringWithoutItem = new List<string>();
IList<string> listStringWithItem = new List<string>() { "One" };
IList<int> listInt = new List<int>();
IList<char> listChar = new List<char>();
IList<long> listLong = new List<long>();
IList<double> listDouble = new List<double>();

Console.WriteLine("string : {0}",
listStringWithoutItem.SingleOrDefault());
Console.WriteLine("string : {0}",
listStringWithItem.SingleOrDefault());
Console.WriteLine("int : {0}", listInt.SingleOrDefault());
Console.WriteLine("char : {0}", listChar.SingleOrDefault());
Console.WriteLine("long : {0}", listLong.SingleOrDefault());
Console.WriteLine("double : {0}", listDouble.SingleOrDefault());
}
}
}

The program in Listing 12-61 will produce the output:

string :
string : One
int : 0
char :
long : 0
double : 0

Let’s analyze the code in Listing 12-61 to understand what’s happening when the CLR executes the SingleOrDefault method.

The CLR casts the original list to IList<TSource> and then checks the number of items in the list. If it is zero, the CLR will return the default value of the type, which is either provided explicitly, for example, string in listStringWithoutItem.SingleOrDefault<string>(), or inferred from the list (for example, listStringWithoutItem is a type of IList<string>, so the inferred type will be string). The implementation of the SingleOrDefault method is demonstrated in Listing 12-62.

Listing 12-62. Implementation of SingleOrDefault Method

public static TSource SingleOrDefault<TSource>(this IEnumerable<TSource> source)
{
IList<TSource> list = source as IList<TSource>;
if (list != null)
{
switch (list.Count)
{
case 0:
return default(TSource);

case 1:
return list[0];
}
}
else
{
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
if (!enumerator.MoveNext())
return default(TSource);

TSource current = enumerator.Current;

if (!enumerator.MoveNext())
return current;
}
}
throw Error.MoreThanOneElement();
}
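
Listings 12-59 and 12-62 differ only in how they treat a sequence with no items; both throw when there is more than one item. The following sketch compares the two methods on lists with zero, one, and two items. The SingleSketch class and its Try helper are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    public static class SingleSketch
    {
        /* Hypothetical helper: runs the query and returns either its
         * result or a note that InvalidOperationException was thrown. */
        public static string Try(Func<string> query)
        {
            try
            {
                return query() ?? "(null)";
            }
            catch (InvalidOperationException)
            {
                return "InvalidOperationException";
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            IList<string> none = new List<string>();
            IList<string> one = new List<string>() { "One" };
            IList<string> two = new List<string>() { "One", "Four" };

            Console.WriteLine("Single, 0 items: {0}",
                SingleSketch.Try(() => none.Single()));
            Console.WriteLine("Single, 1 item: {0}",
                SingleSketch.Try(() => one.Single()));
            Console.WriteLine("Single, 2 items: {0}",
                SingleSketch.Try(() => two.Single()));

            Console.WriteLine("SingleOrDefault, 0 items: {0}",
                SingleSketch.Try(() => none.SingleOrDefault()));
            Console.WriteLine("SingleOrDefault, 1 item: {0}",
                SingleSketch.Try(() => one.SingleOrDefault()));
            Console.WriteLine("SingleOrDefault, 2 items: {0}",
                SingleSketch.Try(() => two.SingleOrDefault()));
        }
    }
}
```

Only the zero-item case differs: Single reports InvalidOperationException, while SingleOrDefault returns null, which the helper prints as (null). With two items, both methods report InvalidOperationException.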

ElementAt

The ElementAt extension method returns the element at a specified index in a sequence. The method signature for this extension method is:

public static TSource ElementAt<TSource>(this IEnumerable<TSource> source, int index)

Listing 12-63 presents the program for use of the ElementAt method.

Listing 12-63. Example of the ElementAt Method

using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<string> numbers = new List<string>()
{
"One","Two","Three"
};

var elementAt = numbers.ElementAt(1);

Console.WriteLine(elementAt);
}
}
}

This program produces the output:

Two

Internal Operation of the ElementAt Method of Execution

Let’s analyze the code in Listing 12-63 carefully to understand what’s happening when we execute the ElementAt method.

1. Step 1: The CLR calls the get_Item method through the System.Collections.Generic.IList'1<T> interface, which the List'1<T> class implements, while executing the ElementAt<TSource>() extension method. The IL code in Listing 12-64 has been extracted from the ElementAt method of the System.Linq.Enumerable class of the System.Core.dll assembly using ildasm.exe.

Listing 12-64. Example of the ElementAt Method

.method public hidebysig static !!TSource ElementAt<TSource>(
class [mscorlib]System.Collections.Generic.IEnumerable'1<!!TSource> source,
int32 index) cil managed
{
IL_0018: ldloc.0
/* Load the argument which is used as the index to retrieve item from
* the array */
IL_0019: ldarg.1
IL_001a: callvirt instance !0 class
[mscorlib]System.Collections.Generic.IList'1<!!TSource>::
get_Item(int32)
IL_001f: ret
} // end of method Enumerable::ElementAt

From this code you can see that the CLR calls the get_Item(int32) method through the IList'1 interface, which in this example is implemented by the System.Collections.Generic.List'1<T> class, with the index (input of the ElementAt method).

2. Step 2: The get_Item(int32) method will load the _items array of this class (label IL_000f in the following IL code) and will load the argument (label IL_0014 in the following IL code) to get the index, which will later be used to access the item from the _items array. The IL code in Listing 12-65 of the get_Item(int32) method is extracted from the System.Collections.Generic.List'1<T> class from the mscorlib.dll using ildasm.exe.

Listing 12-65. IL Code of the get_Item Method

.method public hidebysig newslot specialname virtual final
instance !T get_Item(int32 index) cil managed
{
IL_000e: ldarg.0
/* It will load the _items array */
IL_000f: ldfld !0[] class System.Collections.Generic.List'1<!T>::_items
IL_0014: ldarg.1

/* CLR will load an item based on the index value provided into !T */
IL_0015: ldelem !T
IL_001a: ret
} // end of method List'1::get_Item

In Listing 12-65, the ldfld IL instruction used in the IL_000f will load the _items field of the List<T> class, and on the IL_0014 label it will load argument 1, which is the index of the array that will be used to access the item from the _items array using the IL instruction ldelem used in the IL_0015.

· ldfld fieldName: It will push the value of the field specified in fieldName onto the method’s (get_Item) stack.

· ldelem type: It will load the element at the index currently on the stack from the array and push it onto the stack as the specified type.
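
The IL above covers the case where the source implements IList<T>. For a sequence that does not, the ElementAt method has no indexer to call and walks the sequence instead, and an index past the end results in an ArgumentOutOfRangeException. The following sketch shows both behaviors; the Numbers iterator method is a hypothetical source written for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    class Program
    {
        /* Hypothetical iterator: a sequence that does not implement
         * IList<T>, so ElementAt cannot use an indexer on it. */
        static IEnumerable<string> Numbers()
        {
            yield return "One";
            yield return "Two";
            yield return "Three";
        }

        static void Main(string[] args)
        {
            Console.WriteLine(Numbers().ElementAt(1));

            try
            {
                /* The sequence has only three items. */
                Numbers().ElementAt(8);
            }
            catch (ArgumentOutOfRangeException)
            {
                Console.WriteLine("Index 8 is out of range");
            }
        }
    }
}
```

The program prints Two and then the message from the catch block.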

ElementAtOrDefault

The ElementAtOrDefault method returns the element at a specified index in a sequence or a default value if the index is out of range. The method signature is:

public static TSource ElementAtOrDefault<TSource>(this IEnumerable<TSource> source, int index)

Listing 12-66 presents an example of the ElementAtOrDefault method.

Listing 12-66. Example of the ElementAtOrDefault Method

using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
namespace Ch12
{
public struct MyStruct
{
public string Name { get; set; }
}
public class Person
{
public string PersonID { get; set; }
public int PersonAge { get; set; }
}
class Program
{
static void Main(string[] args)
{
List<string> series = new List<string> { "One", "Two", "Three" };
List<MyStruct> names = new List<MyStruct>
{
new MyStruct{ Name="A"},
new MyStruct{ Name="B"},
};
List<Person> persons = new List<Person>
{
new Person { PersonID = "PA_01", PersonAge = 6 },
new Person { PersonID = "PB_01", PersonAge = 7 },
};
// Output will be null
var item = series.ElementAtOrDefault(8);
// Output will contain an instance of MyStruct in which Name will be null
var name = names.ElementAtOrDefault(8);
// Output will be null
var person = persons.ElementAtOrDefault(8);
}
}
}

Internal Operation of the ElementAtOrDefault Method of Execution

For the ElementAtOrDefault method, the CLR will pass the original list and the index into the ElementAtOrDefault method. The method will then return the item at the specified index or, if the index is out of range, the default value of the element type: null for reference and nullable types, or the zero-initialized default for other value types (see Chapter 1).
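
The behavior described above can be sketched in C# as follows. This is an illustrative re-implementation under the assumption that it mirrors the list and non-list paths seen for the other element-based methods; it is not the framework's source code, and the names ElementAtSketch and MyElementAtOrDefault are hypothetical.

```csharp
using System;
using System.Collections.Generic;

namespace Ch12
{
    public static class ElementAtSketch
    {
        /* Hypothetical re-implementation for illustration only. */
        public static TSource MyElementAtOrDefault<TSource>(
            this IEnumerable<TSource> source, int index)
        {
            if (source == null)
                throw new ArgumentNullException("source");

            if (index >= 0)
            {
                IList<TSource> list = source as IList<TSource>;
                if (list != null)
                {
                    /* Lists can be indexed directly. */
                    if (index < list.Count)
                        return list[index];
                }
                else
                {
                    /* Other sequences are walked item by item. */
                    foreach (TSource item in source)
                    {
                        if (index == 0)
                            return item;
                        index--;
                    }
                }
            }

            /* Negative or too-large index: fall back to the default. */
            return default(TSource);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            List<string> series = new List<string> { "One", "Two", "Three" };

            Console.WriteLine(series.MyElementAtOrDefault(1));
            Console.WriteLine(series.MyElementAtOrDefault(8) == null);
            Console.WriteLine(new int[] { 4, 5 }.MyElementAtOrDefault(-1));
        }
    }
}
```

The program prints Two, True, and 0: a valid index returns the item, while an index that is too large or negative yields null for string and zero for int.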

DefaultIfEmpty

The DefaultIfEmpty method returns the elements of an IEnumerable<T> or a default valued singleton collection if the sequence is empty. The signature for this extension method is:

public static IEnumerable<TSource> DefaultIfEmpty<TSource>(this IEnumerable<TSource> source)
public static IEnumerable<TSource> DefaultIfEmpty<TSource>(
this IEnumerable<TSource> source, TSource defaultValue)

The first version returns the elements of the specified sequence or the type parameter’s default value in a singleton collection if the sequence is empty. The second version returns the elements of the specified sequence or the specified value in a singleton collection if the sequence is empty.

This method can be used on a list that does not have items in it. If we call the extension method over such a list, it will return a sequence that contains a single item holding the default value of the item type. Listing 12-67 presents a program using the DefaultIfEmpty method.

Listing 12-67. Use of the DefaultIfEmpty Method

using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{

IList<Person> persons = new List<Person>();
IList<int> numbers = new List<int>();
IList<string> names = new List<string>();
/* Output: A list with 1 item with the null value */
var defaultPersons = persons.DefaultIfEmpty();
/*Output: A list with 1 item with the 0 value */
var defaultNumbers = numbers.DefaultIfEmpty().ToList();
/* Output: A list with 1 item with the null value */
var defaultNames = names.DefaultIfEmpty();
}
}

class Person
{
public string Name
{
get;
set;
}

public string Address
{
get;
set;
}

public int Age
{
get;
set;
}
}
}

Internal Operation of the DefaultIfEmpty Method of Execution

Listing 12-67 has three lists, persons, numbers, and names, of type Person, int, and string, respectively. These three lists do not have any items, so the Count property of each list returns zero. When the DefaultIfEmpty method is called over any of these lists, the CLR will then:

· Pass the list to the DefaultIfEmpty method. From this method the CLR will return the instance of the DefaultIfEmptyIterator<TSource> iterator, which will hold the default value and the source for the related type. The defaultValue will contain the default value of the type of list it is processing, and the source will be the original list.

· Pass the DefaultIfEmptyIterator to the ToList method (as is done for numbers in Listing 12-67), which will call the constructor of the List<T> class, passing the object of the DefaultIfEmptyIterator as input. In this constructor, the CLR will iterate through the original list and process the result.

The implementation of the DefaultIfEmptyIterator is shown in Listing 12-68.

Listing 12-68. Implementation of the DefaultIfEmptyIterator Method

private static IEnumerable<TSource> DefaultIfEmptyIterator<TSource>(
IEnumerable<TSource> source,
TSource defaultValue)
{
using (IEnumerator<TSource> iteratorVariable0 = source.GetEnumerator())
{
if (iteratorVariable0.MoveNext())
do
{
yield return iteratorVariable0.Current;
}
while (iteratorVariable0.MoveNext());
else
yield return defaultValue;
}
}

From Listing 12-68, you can see that when the source has no items, the iterator is not able to iterate through it, and as a result, the CLR will return the value of defaultValue of the DefaultIfEmptyIterator. In this circumstance, the defaultValue of the DefaultIfEmptyIterator for the persons.DefaultIfEmpty() will hold null (because persons is of type IList<Person>), zero for the numbers (because it is of type IList<int>), and null for the names (because it is of type IList<string>).
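
Listing 12-67 uses only the parameterless version. As a brief sketch of the second version, the following program supplies -1 as the fallback value and prints the resulting sequences; the value -1 and the list contents are arbitrary choices for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    class Program
    {
        static void Main(string[] args)
        {
            IList<int> empty = new List<int>();
            IList<int> filled = new List<int>() { 1, 2, 3 };

            /* Empty source: the result holds only the specified value. */
            Console.WriteLine(string.Join(",", empty.DefaultIfEmpty(-1)));

            /* Non-empty source: the items pass through unchanged. */
            Console.WriteLine(string.Join(",", filled.DefaultIfEmpty(-1)));

            /* Without an argument, default(int) is used. */
            Console.WriteLine(string.Join(",", empty.DefaultIfEmpty()));
        }
    }
}
```

The three lines of output are -1, then 1,2,3, and then 0, which corresponds to the two branches of the iterator in Listing 12-68.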

Generation-Based Methods

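This section will examine the Empty, Range, and Repeat methods. Unlike the methods discussed so far, these are ordinary static methods of the Enumerable class rather than extension methods, because they do not take a source sequence as input.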

Empty

The Empty method returns an empty IEnumerable<T>. The method signature for this method is:

public static IEnumerable<TResult> Empty<TResult>()

Listing 12-69 shows the use of the Empty method.

Listing 12-69. Example of the Empty Method

using System;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
var emptyList = Enumerable.Empty<int>();

Console.WriteLine(emptyList.Count());
}
}
}

The program in Listing 12-69 produces the output:

0

Internal Operation of the Empty Method of Execution

When the CLR executes the Empty method, as shown in Listing 12-69, it returns an empty sequence of int. Let’s analyze the code in Listing 12-69 to help us understand what’s happening as we execute this program.

1. Step 1: The CLR calls the get_Instance method of the EmptyEnumerable'1<!!TResult> (an internal class from the System.Linq namespace of the System.Core.dll assembly) while executing the Empty<TResult> method. This class has a static array field of the given type (!!TResult) and an Instance property that returns the array field. The IL code in Listing 12-70 of the Empty method (decompiled from the System.Core.dll assembly) demonstrates how the get_Instance method is being called.

Listing 12-70. Example of the Empty Method

.method public hidebysig static class
[mscorlib]System.Collections.Generic.IEnumerable'1<!!TResult>
Empty<TResult>() cil managed
{
IL_0000: call class [mscorlib]System.Collections.Generic.IEnumerable'1<!0>
class System.Linq.EmptyEnumerable'1<!!TResult>::get_Instance()
IL_0005: ret
} // end of method Enumerable::Empty

2. Step 2: The getter of the Instance property of the System.Linq.EmptyEnumerable'1<!!TResult> is compiled into the get_Instance method. The get_Instance method first loads the instance field (label IL_0000) and, if the field already refers to an array, jumps to the label IL_0012 and returns it. Otherwise, it will create an array with zero items. The CLR will first push zero onto the stack as int32 type using the ldc.i4.0 IL instruction at the label IL_0007, as shown in Listing 12-71. Using the newarr IL instruction, the CLR will create a new array with zero items and will push it onto the stack. At the label IL_000d, the CLR will use stsfld to store the value from the stack into the instance field.

Listing 12-71. Internal Operation of the get_Instance Method

.method public hidebysig specialname static
class [mscorlib]System.Collections.Generic.IEnumerable'1<!TElement>
get_Instance() cil managed
{
// code size 24 (0x18)
.maxstack 8
IL_0000: ldsfld !0[] class
System.Linq.EmptyEnumerable'1<!TElement>::'instance'
IL_0005: brtrue.s IL_0012
IL_0007: ldc.i4.0
IL_0008: newarr !TElement
IL_000d: stsfld !0[] class
System.Linq.EmptyEnumerable'1<!TElement>::'instance'
IL_0012: ldsfld !0[] class
System.Linq.EmptyEnumerable'1<!TElement>::'instance'
IL_0017: ret
} // end of method EmptyEnumerable'1::get_Instance
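
The IL in Listing 12-71 corresponds to a lazily created, cached array. A rough C# equivalent is sketched below; the EmptyHolder class is a hypothetical stand-in written for this illustration and is not the internal EmptyEnumerable class itself. The comments map each statement to the IL instructions discussed above.

```csharp
using System;
using System.Collections.Generic;

namespace Ch12
{
    /* Hypothetical stand-in for the internal EmptyEnumerable class. */
    public static class EmptyHolder<TElement>
    {
        private static TElement[] instance;

        public static IEnumerable<TElement> Instance
        {
            get
            {
                /* ldsfld, brtrue.s: skip creation when the array exists. */
                if (instance == null)
                {
                    /* ldc.i4.0, newarr, stsfld */
                    instance = new TElement[0];
                }

                /* ldsfld, ret */
                return instance;
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            IEnumerable<int> first = EmptyHolder<int>.Instance;
            IEnumerable<int> second = EmptyHolder<int>.Instance;

            /* Both calls return the same cached array object. */
            Console.WriteLine(object.ReferenceEquals(first, second));
        }
    }
}
```

The program prints True, because the zero-length array is created on the first call only and is reused afterward.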

Range

The Range method generates a sequence of integral numbers within a specified range, implemented by using deferred execution. The immediate return value is an instance of the relevant iterator that stores all the information required to perform the action. The method signature for this method is:

public static IEnumerable<int> Range(int start, int count)

This method will create a sequence of count consecutive int items, beginning at the start number. Listing 12-72 demonstrates the use of the Range method.

Listing 12-72. Example of the Range Method

using System;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
Enumerable.Range(1, 10).ToList().ForEach(x =>
Console.Write("{0}\t", x));
}
}
}

The program in Listing 12-72 will produce the output:

1 2 3 4 5 6 7 8 9 10

Internal Operation of the Range Method of Execution

Figure 12-24 demonstrates how the CLR executes the Range method.

Figure 12-24. How the Range method works

The CLR follows these steps to execute the Range operation:

1. Step 1: The CLR passes the start element and the length of the generated sequence to the Range method as input. The code in Listing 12-73 demonstrates that Range returns the iterator object produced by the RangeIterator method, which holds the related information, such as the start element and the length of the sequence.

Listing 12-73. Implementation of the Range Method

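public static IEnumerable<int> Range(int start, int count)
{
// Compute the last value as a long so that the addition cannot overflow.
long num = ((long) start + count) - 1L;
// Validation runs immediately, before any iterator is created. The framework
// raises this exception through an internal helper class.
if ((count < 0) || (num > 0x7fffffffL))
throw new ArgumentOutOfRangeException("count");
return RangeIterator(start, count);
}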

The iterator returned by RangeIterator does not execute (because of deferred execution) until it is enumerated, which in Listing 12-72 happens when the ToList method is called.

2. Step 2: The CLR passes this iterator to the ToList method, which hands it to the List<T> constructor. The constructor enumerates it, running the iteration logic implemented in the RangeIterator method to produce the ranged sequence as output. The implementation of RangeIterator is shown in Listing 12-74.

Listing 12-74. Implementation of the RangeIterator Method

private static IEnumerable<int> RangeIterator(int start, int count)
{
int iteratorVariable0 = 0;
while (true)
{
if (iteratorVariable0 >= count)
yield break;

yield return (start + iteratorVariable0);
iteratorVariable0++;
}
}
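
To make the deferred behavior visible, the sketch below uses a traced copy of the iterator. TracedRange and Log are hypothetical names introduced only for this illustration; the final part relies on the fact that Enumerable.Range validates its arguments immediately, even though it generates values lazily.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    public static class RangeSketch
    {
        // Records when the iterator body actually runs.
        public static readonly List<string> Log = new List<string>();

        // Hypothetical traced version of the iterator in Listing 12-74.
        public static IEnumerable<int> TracedRange(int start, int count)
        {
            for (int i = 0; i < count; i++)
            {
                Log.Add("yield " + (start + i));
                yield return start + i;
            }
        }

        public static void Main(string[] args)
        {
            IEnumerable<int> range = TracedRange(5, 3);
            Console.WriteLine(Log.Count);                 // 0: nothing has run yet

            List<int> numbers = range.ToList();           // enumeration happens here
            Console.WriteLine(Log.Count);                 // 3
            Console.WriteLine(string.Join(",", numbers)); // 5,6,7

            try
            {
                // int.MaxValue + 1 does not fit in an int, so the call is rejected.
                Enumerable.Range(int.MaxValue, 2);
            }
            catch (ArgumentOutOfRangeException)
            {
                Console.WriteLine("count rejected before any iteration");
            }
        }
    }
}
```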

Repeat

The Repeat method generates a sequence that contains one repeated value and is implemented by using deferred execution. The immediate return value is an object of the relevant iterator type that stores all the information that is required to perform the action. The signature of this method is:

public static IEnumerable<TResult> Repeat<TResult>(TResult element, int count)

It generates a sequence that contains the element value, of type TResult, repeated count times, as shown in Listing 12-75.

Listing 12-75. Example of the Repeat Method

using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
Enumerable.Repeat(1, 5).ToList().
ForEach(x=>Console.Write("{0}\t",x));
}
}
}

The Repeat method of the Enumerable class will generate a sequence of 1 five times. It will produce the output:

1 1 1 1 1

Internal Operation of the Repeat Method of Execution

The CLR follows these steps in executing the Repeat method:

1. Step 1: The CLR passes the element to repeat and the number of times to repeat it to the Repeat method as input. Inside the Repeat method, it constructs the RepeatIterator<TResult> iterator, which holds the information needed to generate the sequence.

2. Step 2: The CLR passes this RepeatIterator<TResult> instance to the ToList method, which hands it to the List<T> constructor. The constructor enumerates it, running the iteration logic implemented in RepeatIterator<TResult> to produce the repeated sequence as output. The implementation of the RepeatIterator method is:

private static IEnumerable<TResult> RepeatIterator<TResult>(TResult element, int count)
{
int iteratorVariable0 = 0;
while (true)
{
if (iteratorVariable0 >= count)
yield break;

yield return element;
iteratorVariable0++;
}
}
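
Because the iterator yields the same element on every pass, repeating a reference type produces several references to one object rather than several objects. The following sketch illustrates this; RepeatSketch and RepeatBuilder are hypothetical names used only for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Ch12
{
    public static class RepeatSketch
    {
        public static List<StringBuilder> RepeatBuilder(int count)
        {
            // One StringBuilder is created; Repeat yields that same reference each time.
            return Enumerable.Repeat(new StringBuilder(), count).ToList();
        }

        public static void Main(string[] args)
        {
            List<StringBuilder> builders = RepeatBuilder(3);
            builders[0].Append("shared");

            Console.WriteLine(object.ReferenceEquals(builders[0], builders[2])); // True
            Console.WriteLine(builders[2].ToString());                           // shared
        }
    }
}
```

If distinct objects are needed, a projection such as Enumerable.Range(0, count).Select(i => new StringBuilder()) creates a new instance per element.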

Conversion-Based Methods

This section will examine the different conversion-based extension methods, such as Cast, ToArray, OfType, ToDictionary, ToList, and ToLookup.

Cast

The Cast method casts the elements of a sequence to the specified type. This method is implemented by using deferred execution. Listing 12-76 presents an example of the Cast method.

Listing 12-76. Example of the Cast Method

using System;
using System.Collections;
using System.Linq;
namespace Ch12
{
class Program
{
static void Main(string[] args)
{
ArrayList numbers = new ArrayList();
numbers.Add("One");
numbers.Add("Two");
numbers.Add("Three");
numbers.Cast<string>().Select(number => number).ToList().ForEach(
number => Console.Write("{0}\t", number));
}
}
}

The program will produce the output:

One Two Three

Internal Operation of the Cast Method of Execution

Let’s analyze the code in Listing 12-76 to understand what happens when the CLR executes the Cast method. Figure 12-25 demonstrates that numbers is passed as input to the Cast method internally.

Figure 12-25. Internal operation of the Cast method

The CLR follows several steps in executing the Cast method:

1. Step 1: It encapsulates the original list, numbers, in the CastIterator and passes this to the Select method.

2. Step 2: The Select method constructs the WhereSelectEnumerableIterator and passes it to the ToList method, which iterates through the enumerator. As it does so, the CastIterator casts each iterated item to the type specified in the Cast call, which for the code in Listing 12-76 is string. The implementation of the CastIterator is:

private static IEnumerable<TResult> CastIterator<TResult>(IEnumerable source)
{
IEnumerator enumerator = source.GetEnumerator();
while (enumerator.MoveNext())
{
object current = enumerator.Current;
yield return (TResult) current;
}
}
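
Because the cast sits inside the iterator, an element of the wrong type is detected only when enumeration reaches it. The sketch below illustrates this with a mixed ArrayList; CastSketch and CastUntilFailure are hypothetical names introduced for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    public static class CastSketch
    {
        // Returns the elements converted before the first failing cast.
        public static List<int> CastUntilFailure(IEnumerable source)
        {
            List<int> converted = new List<int>();
            IEnumerable<int> query = source.Cast<int>(); // no cast has run yet

            try
            {
                foreach (int value in query)
                    converted.Add(value);
            }
            catch (InvalidCastException)
            {
                Console.WriteLine("Cast failed during enumeration");
            }
            return converted;
        }

        public static void Main(string[] args)
        {
            ArrayList mixed = new ArrayList();
            mixed.Add(1);
            mixed.Add(2);
            mixed.Add("Three");

            // The first two items are converted before the string is reached.
            Console.WriteLine(string.Join(",", CastUntilFailure(mixed))); // 1,2
        }
    }
}
```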

ToArray

The ToArray method creates an array from the given sequence. ToList<TSource> has similar behavior, but it returns a List<T> instead of an array. The program in Listing 12-77 shows the use of the ToArray method. The method signature is:

public static TSource[] ToArray<TSource>(this IEnumerable<TSource> source)

Listing 12-77. Example of the ToArray Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> firstList = new List<int>()
{
1,2,3,4
};

var result = firstList.ToArray();
result.ToList().ForEach(x =>
Console.Write(string.Format("{0}\t",x)));
}
}
}

This program will produce the output:

1 2 3 4

Internal Operation of the ToArray Method of Execution

Let’s analyze the code in Listing 12-77 to understand what’s happening when the CLR executes the ToArray method.

1. Step 1: The CLR passes the original list as input to the ToArray<TSource> method. Inside the ToArray method it will create an instance of the Buffer<TSource> type by passing the original list object as input.

2. Step 2: The CLR will copy each of the items from the original list to an internal array named items. The implementation of the ToArray method is:

public static TSource[] ToArray<TSource>(this IEnumerable<TSource> source)
{
Buffer<TSource> buffer = new Buffer<TSource>(source);

return buffer.ToArray();
}

The internal Buffer<TElement> struct is:

internal struct Buffer<TElement>
{
internal TElement[] items;
internal int count;
internal Buffer(IEnumerable<TElement> source)
{
TElement[] array = null;
int length = 0;
ICollection<TElement> is2 = source as ICollection<TElement>;
if (is2 != null)
{
length = is2.Count;
if (length > 0)
{
array = new TElement[length];
is2.CopyTo(array, 0);
}
}
else
{
foreach (TElement local in source)
{
if (array == null)
{
array = new TElement[4];
}
else if (array.Length == length)
{
TElement[] destinationArray = new TElement[length * 2];
Array.Copy(array, 0, destinationArray, 0, length);
array = destinationArray;
}
array[length] = local;
length++;
}
}
this.items = array;
this.count = length;
}
}

3. Step 3: When the CLR calls the ToArray method of the Buffer<TElement> struct, it returns the buffered items.

The implementation of this ToArray method is:

internal TElement[] ToArray()
{
if (this.count == 0)
return new TElement[0];

if (this.items.Length == this.count)
return this.items;

TElement[] destinationArray = new TElement[this.count];
Array.Copy(this.items, 0, destinationArray, 0, this.count);
return destinationArray;
}

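From this source code we can see that ToArray returns the items array itself when it is exactly full; otherwise, it returns a copy trimmed to count elements. In both cases, the returned array is separate from the original source, because the Buffer constructor always copies the source items into a new array.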
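
One practical consequence is that the array is a snapshot: later changes to the source are not reflected in it. The following is a small sketch of that behavior; ToArraySketch and Snapshot are hypothetical names, and the parameter is typed as IEnumerable<int> so that the call binds to the Enumerable.ToArray extension method rather than to the ToArray instance method of List<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    public static class ToArraySketch
    {
        public static int[] Snapshot(IEnumerable<int> source)
        {
            // Typed as IEnumerable<int>, so this binds to Enumerable.ToArray.
            return source.ToArray();
        }

        public static void Main(string[] args)
        {
            List<int> list = new List<int> { 1, 2, 3, 4 };
            int[] snapshot = Snapshot(list);

            // Change the source after the array has been created.
            list[0] = 99;
            list.Add(5);

            Console.WriteLine(string.Join(",", snapshot)); // 1,2,3,4
            Console.WriteLine(string.Join(",", list));     // 99,2,3,4,5
        }
    }
}
```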

OfType

The OfType extension method filters the elements of an IEnumerable based on a specified type and is implemented by using deferred execution. The immediate return value is an object of the iterator class, which stores all the information that is required to perform the action. The signature of this extension method is:

public static IEnumerable<TResult> OfType<TResult>(this IEnumerable source)

Listing 12-78 presents an example of use of the OfType method.

Listing 12-78. Example of the OfType<TResult> Method

using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<object> numbers = new List<object>()
{
"One",
"Two",
1,
2,
"Three",
new Person
{
Name="A Person"
}
};

var filteredNumbers = numbers.OfType<string>();

filteredNumbers.ToList().ForEach(x => Console.Write("{0}\t", x));
Console.WriteLine();
}
}

public class Person
{
public string Name { get; set; }
}
}

This program selects only the string values from the numbers list, because string is the type argument passed to the OfType method. The program will produce the output:

One Two Three

Internal Operation of the OfType Method of Execution

Let’s analyze the code in Listing 12-78 to really understand what’s happening when the CLR executes the OfType method.

1. Step 1: The CLR will pass the sequence, for example, numbers in this example, to the OfType method as input. Inside the OfType method, the CLR will instantiate the OfTypeIterator, which will hold the original sequence inside it. The implementation of the OfType method is:

public static IEnumerable<TResult> OfType<TResult>(this IEnumerable source)
{
return OfTypeIterator<TResult>(source);
}

2. Step 2: The CLR passes the instance of the OfTypeIterator<TResult> class to the ToList method, which hands it to the List<T> constructor. The constructor enumerates it, running the iteration logic implemented in the OfTypeIterator to produce the filtered sequence as output. The implementation of the OfTypeIterator is:

private static IEnumerable<TResult> OfTypeIterator<TResult>(IEnumerable source)
{
IEnumerator enumerator = source.GetEnumerator();
while (enumerator.MoveNext())
{
object current = enumerator.Current;
if (current is TResult)
yield return (TResult) current;
}
}
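
The is test in the iterator means that elements of other types, and null values, are skipped without raising an exception. The sketch below illustrates this; OfTypeSketch and StringsOnly are hypothetical names used only for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    public static class OfTypeSketch
    {
        public static List<string> StringsOnly(IEnumerable source)
        {
            // Non-string items and nulls fail the "is" test and are skipped.
            return source.OfType<string>().ToList();
        }

        public static void Main(string[] args)
        {
            object[] mixed = { "One", 1, null, "Two", 2.5 };

            Console.WriteLine(string.Join(",", StringsOnly(mixed))); // One,Two
            Console.WriteLine(mixed.OfType<int>().Count());          // 1
        }
    }
}
```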

ToDictionary

The ToDictionary method creates a Dictionary<TKey, TValue> from an IEnumerable<T>. If you want to create a dictionary object based on the data in a list, this method will take care of it, but you need to specify a key selector that extracts the key from each item. The overloads of this method are:

public static Dictionary<TKey, TSource> ToDictionary<TSource, TKey>(
this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
public static Dictionary<TKey, TSource> ToDictionary<TSource, TKey>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
public static Dictionary<TKey, TElement> ToDictionary<TSource, TKey, TElement>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector, Func<TSource, TElement> elementSelector)
public static Dictionary<TKey, TElement> ToDictionary<TSource, TKey, TElement>(
this IEnumerable<TSource> source, Func<TSource, TKey> keySelector,
Func<TSource, TElement> elementSelector, IEqualityComparer<TKey> comparer)

Listing 12-79 presents an example to explain the use of the ToDictionary method.

Listing 12-79. Example of ToDictionary Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<Person> persons = new List<Person>()
{
new Person(){ Name="Person A", Address= "Address of A",
Id= 111111},
new Person(){ Name="Person B", Address= "Address of B",
Id= 111112},
new Person(){ Name="Person C", Address= "Address of C",
Id= 111113},
new Person(){ Name="Person D", Address= "Address of D",
Id= 111114},
};

var result = persons.ToDictionary(person => person.Id);

foreach (KeyValuePair<double, Person> person in result)
{
Console.WriteLine("{0,-15} {1,-20}{2,-20}{3,-20}",
person.Key,
person.Value.Name,
person.Value.Address,
person.Value.Id);
}
}
}


public class Person
{
public string Name
{
get;
set;
}

public string Address
{
get;
set;
}

public double Id
{
get;
set;
}
}
}

The program in Listing 12-79 produces the output:

111111 Person A Address of A 111111
111112 Person B Address of B 111112
111113 Person C Address of C 111113
111114 Person D Address of D 111114

Internal Operation of the ToDictionary Method of Execution

From the code in Listing 12-79 you can see that a list of Person objects is created and stored in persons, and persons is then converted into a dictionary using the ToDictionary method. ToDictionary takes a lambda expression as input. This lambda expression is the key selector, which selects the key from each item in the list and sets it as the key in the dictionary object. For example, Id from the Person is selected as the key for the result dictionary, and the value is the person object itself. The Id property is therefore used as the dictionary key and also remains stored in the person object as it was initialized. Let’s analyze Listing 12-79 to help us understand what’s happening when the CLR executes the ToDictionary method.

1. Step 1: If you open the System.Linq.Enumerable class from the System.Core.dll assembly using ildasm.exe, you can see that the ToDictionary method internally calls the overloaded ToDictionary method, which has the signature:

public static Dictionary<TKey, TElement> ToDictionary<TSource, TKey, TElement>(
this IEnumerable<TSource> source, Func<TSource, TKey> keySelector,
Func<TSource, TElement> elementSelector, IEqualityComparer<TKey> comparer)

Before the CLR calls the overloaded ToDictionary method from the ToDictionary extension method, it obtains an element selector function. Although we haven’t provided an element selector in Listing 12-79, the default element selector, IdentityFunction<TSource>.Instance, is used. IdentityFunction<TElement> is an internal class whose Instance property returns a delegate that returns its argument unchanged. Figure 12-26 shows this class from the System.Core.dll assembly.

Figure 12-26. IdentityFunction in the System.Linq

The implementation of the ToDictionary method is presented in Listing 12-80.

Listing 12-80. Implementation of the ToDictionary Method

public static Dictionary<TKey, TSource> ToDictionary<TSource, TKey>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector)
{
return source.ToDictionary<TSource, TKey, TSource>(
keySelector, IdentityFunction<TSource>.Instance, null);
}

2. Step 2: When the CLR goes to the above method, it calls the overloaded ToDictionary as shown in Listing 12-81.

Listing 12-81. Implementation of the ToDictionary Method

public static Dictionary<TKey, TElement> ToDictionary<TSource, TKey, TElement>(
this IEnumerable<TSource> source, Func<TSource, TKey> keySelector,
Func<TSource, TElement> elementSelector, IEqualityComparer<TKey> comparer)
{
Dictionary<TKey, TElement> dictionary =
new Dictionary<TKey, TElement>(comparer);

foreach (TSource local in source)
dictionary.Add(keySelector(local), elementSelector(local));
return dictionary;
}

This will instantiate an instance of the Dictionary<TKey, TElement> class. It will then iterate through the original list, passing each iterated value to the keySelector and elementSelector functions to extract the key and the value. The lambda expression (person => person.Id) is used as the keySelector. The C# compiler generates an anonymous method, <Main>b_5, and passes it as the keySelector, which returns the Id from the person object. The elementSelector is the x=>x lambda in IdentityFunction<TElement>, which the compiler compiles as <get_Instance>b__0; it returns the value itself (i.e., the person object with the Name, Address, and Id values inside).
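
Because every item is inserted with Dictionary.Add, a key that occurs twice causes an ArgumentException. The following sketch exercises the overload that takes a keySelector, an elementSelector, and a comparer; ToDictionarySketch and LengthByWord are hypothetical names chosen for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    public static class ToDictionarySketch
    {
        public static Dictionary<string, int> LengthByWord(IEnumerable<string> words)
        {
            // keySelector: the word itself; elementSelector: its length;
            // comparer: keys that differ only by case are treated as equal.
            return words.ToDictionary(
                word => word,
                word => word.Length,
                StringComparer.OrdinalIgnoreCase);
        }

        public static void Main(string[] args)
        {
            Dictionary<string, int> lengths = LengthByWord(new[] { "One", "Three" });
            Console.WriteLine(lengths["THREE"]); // 5

            try
            {
                // "One" and "ONE" are equal under the comparer, so Add fails.
                LengthByWord(new[] { "One", "ONE" });
            }
            catch (ArgumentException)
            {
                Console.WriteLine("Duplicate key rejected");
            }
        }
    }
}
```

When duplicate keys are expected in the data, ToLookup is usually the better fit, because it keeps all values for a key.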

ToList

The ToList method creates a List<T> from an IEnumerable<T>. The method signature for this extension method is:

public static List<TSource> ToList<TSource>(this IEnumerable<TSource> source)

Listing 12-82 presents an example of the ToList method.

Listing 12-82. Example of the ToList() Extension Method

using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> numbers = new List<int>()
{
1,2,3,4,5,6,7,8,9,10
};

var result = numbers.Where(x => x > 3).ToList();

result.ForEach(x => Console.Write("{0}\t", x));
Console.WriteLine();
}
}
}

This program will produce the output:

4 5 6 7 8 9 10

Internal Operation of the ToList Method of Execution

Let’s analyze the code in Listing 12-82 to really understand what’s happening when the CLR executes the ToList method.

1. Step 1: The ToList method accepts an IEnumerable<TSource> object as input and passes it to the constructor of the List<TSource> type:

public static List<TSource> ToList<TSource>(this IEnumerable<TSource> source)
{
return new List<TSource>(source);
}

2. Step 2: The List<TSource> constructor accepts an IEnumerable<TSource> collection as input. If the collection implements ICollection<T>, the constructor allocates the _items array with exactly Count elements and copies the items with CopyTo. Otherwise, it initializes the _items array with a default size of 4 and then iterates through the enumerator of the input sequence, as demonstrated in Listing 12-83.

Listing 12-83. Implementation for the List Constructor

public List(IEnumerable<T> collection)
{
ICollection<T> is2 = collection as ICollection<T>;
if (is2 != null)
{
int count = is2.Count;
this._items = new T[count];
is2.CopyTo(this._items, 0);
this._size = count;
}
else
{
this._size = 0;
this._items = new T[4];
using (IEnumerator<T> enumerator = collection.GetEnumerator())
{
while (enumerator.MoveNext())
this.Add(enumerator.Current);
}
}
}

3. Step 3: The CLR passes each of the items from the iteration phase in step 2 to the Add(T item) method, which adds them to the _items array initialized in step 2. The implementation of the Add method is:

public void Add(T item) {
if (this._size == this._items.Length)
this.EnsureCapacity(this._size + 1);

this._items[this._size++] = item;
this._version++;
}

In the Add method, the most important code is the line this.EnsureCapacity(this._size + 1). The _items array has a fixed length, so when it is full the EnsureCapacity method allocates a larger array and copies the existing items into it before the new item is stored.

4. Step 4: After finishing the iteration, the CLR will return the list object as the output of the ToList method, which will contain elements returned from the given IEnumerable<TSource> object inside the _items array.
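
The effect of EnsureCapacity can be observed through the public Capacity property. The sketch below records each distinct capacity while items are added; ListGrowthSketch and CapacitiesWhileAdding are hypothetical names. The values in the comment reflect the start-at-4-then-double growth of the List<T> implementation, which is an implementation detail rather than a documented guarantee.

```csharp
using System;
using System.Collections.Generic;

namespace Ch12
{
    public static class ListGrowthSketch
    {
        // Adds itemCount items and records every new Capacity value observed.
        public static List<int> CapacitiesWhileAdding(int itemCount)
        {
            List<int> items = new List<int>();
            List<int> capacities = new List<int>();

            for (int i = 0; i < itemCount; i++)
            {
                items.Add(i);
                if (capacities.Count == 0 ||
                    capacities[capacities.Count - 1] != items.Capacity)
                {
                    capacities.Add(items.Capacity);
                }
            }
            return capacities;
        }

        public static void Main(string[] args)
        {
            // The backing array grows on the 1st, 5th, and 9th Add.
            Console.WriteLine(string.Join(",", CapacitiesWhileAdding(9))); // 4,8,16
        }
    }
}
```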

ToLookup

The ToLookup method creates a lookup based on the specified key selector. The overloads of this extension method are:

public static ILookup<TKey, TSource> ToLookup<TSource, TKey>(
this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
public static ILookup<TKey, TSource> ToLookup<TSource, TKey>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
public static ILookup<TKey, TElement> ToLookup<TSource, TKey, TElement>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector, Func<TSource, TElement> elementSelector)
public static ILookup<TKey, TElement> ToLookup<TSource, TKey, TElement>(
this IEnumerable<TSource> source, Func<TSource, TKey> keySelector,
Func<TSource, TElement> elementSelector, IEqualityComparer<TKey> comparer)

The ToLookup method creates a mapping of Key to Value for a given list. In contrast to a Dictionary, which maps one key to one value, the lookup produced by ToLookup maps one key to many values. For example, if you have a list such as:

/* Before: applying the ToLookup Key and Data combination */
{ A, { A1.1, A1.1 } }
{ A, { A2.2, A2.2 } }
{ A, { A3.3, A3.3 } }

{ B, { B1.1, B1.1 } }

{ C, { C1.1, C1.1 } }
{ C, { C2.1, C2.2 } }

the ToLookup will produce the output:

/* After: applying the ToLookup Key and Data combination */
{ A, { A1.1, A1.1 } }
{ { A2.2, A2.2 } }
{ { A3.3, A3.3 } }

{ B, { B1.1, B1.1 } }

{ C, { C1.1, C1.1 } }
{ { C2.1, C2.2 } }

So the ToLookup method groups the entire set of items based on the Key:

· All the data in A category groups under A

· All the data in B category groups under B

· All the data in C category groups under C

Figure 12-27 demonstrates the basics of the ToLookup method.

Figure 12-27. Basic operation for the ToLookup method

Listing 12-84 presents an example of the ToLookup method.

Listing 12-84. Example of the ToLookup Method

using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
List<Person> persons = CreatePersonList();
var result = persons.ToLookup(
(key) => key.Name,
(groupItem) => groupItem.Address);

result.ToList().ForEach(item =>
{
Console.Write("Key:{0,11}\nValue:\t{1,12}\n",
item.Key,
string.Join("\n\t", item.Select(groupItem =>
groupItem).ToArray()));
});
}

private static List<Person> CreatePersonList()
{
return new List<Person>()
{
new Person{ Name="APerson", Address="APerson's Address"},
new Person{ Name="AAPerson", Address="AAPerson's Address"},
new Person{ Name="APerson",
Address="APerson's Second Address"},
new Person{ Name="BPerson", Address="BPerson's Address"},
new Person{ Name="BBPerson", Address="BBPerson's Address"},
new Person{ Name="BPerson",
Address="BPerson's Second Address"},
new Person{ Name="CPerson", Address="CPerson's Address"},
new Person{ Name="CCPerson", Address="CCPerson's Address"},
new Person{ Name="CPerson",
Address="CPerson's Second Address"},
};
}
}

public class Person
{
public string Name { get; set; }
public string Address { get; set; }
}
}

This program will produce the output:

Key: APerson
Value: APerson's Address
APerson's Second Address
Key: AAPerson
Value: AAPerson's Address
Key: BPerson
Value: BPerson's Address
BPerson's Second Address
Key: BBPerson
Value: BBPerson's Address
Key: CPerson
Value: CPerson's Address
CPerson's Second Address
Key: CCPerson
Value: CCPerson's Address

Internal Operation of the ToLookup Method of Execution

Let’s analyze the code in Listing 12-84 to understand what’s happening when the CLR executes the ToLookup method.

1. Step 1: The C# compiler will construct a method <Main>b__1 using the anonymous method (key) => key.Name and <Main>b__2 using the anonymous method (groupItem) => groupItem.Address. The CLR will instantiate two delegate instances (of types derived from MulticastDelegate) using the <Main>b__1 and <Main>b__2 methods and pass them to the Lookup class as the keySelector and elementSelector:

public static ILookup<TKey, TElement> ToLookup<TSource, TKey, TElement>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector, Func<TSource, TElement> elementSelector)
{
return (ILookup<TKey, TElement>) Lookup<TKey, TElement>.Create<TSource>(
source, /* Original list */
keySelector, /* <Main>b__1 */
elementSelector, /* <Main>b__2 */
null); /* Comparer */
}

2. Step 2: Internally, the Create method of the Lookup class instantiates an instance of the Lookup class, whose constructor initializes the groupings data structure, an array of type Grouping with a default size of 7, as shown in this partial code of the constructor:

private Lookup(IEqualityComparer<TKey> comparer)
{
/* Code removed */
this.groupings = new Grouping<TKey, TElement>[7];
}

This groupings array initially holds default values of the specified type. The CLR then iterates through the original list and, using the keySelector and elementSelector, adds each iterated item to the grouping for its key:

foreach (TSource local in source)
{
lookup.GetGrouping(keySelector(local), true).Add(elementSelector(local));
}

On return of the Create method, the CLR will return the instance of the Lookup class, which will hold the lookup table for the original sequence. The implementation code for the Create method in the Lookup class is:

internal static Lookup<TKey, TElement> Create<TSource>(
IEnumerable<TSource> source, Func<TSource, TKey> keySelector,
Func<TSource, TElement> elementSelector, IEqualityComparer<TKey> comparer)
{
Lookup<TKey, TElement> lookup = new Lookup<TKey, TElement>(comparer);
foreach (TSource local in source)
{
lookup.GetGrouping(keySelector(local),
true).Add(elementSelector(local));
}
return lookup;
}
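
From the caller's side, the resulting lookup is read through its indexer, which returns every element stored under a key. The following sketch shows typical use; ToLookupSketch and ByFirstLetter are hypothetical names introduced for the example. Note that indexing a key that is not present yields an empty sequence rather than throwing an exception, in contrast to Dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    public static class ToLookupSketch
    {
        public static ILookup<string, string> ByFirstLetter(IEnumerable<string> words)
        {
            // keySelector: the first letter; elementSelector: the word itself.
            return words.ToLookup(word => word.Substring(0, 1), word => word);
        }

        public static void Main(string[] args)
        {
            ILookup<string, string> lookup =
                ByFirstLetter(new[] { "Apple", "Avocado", "Banana" });

            Console.WriteLine(string.Join(",", lookup["A"])); // Apple,Avocado
            Console.WriteLine(lookup.Contains("B"));          // True
            Console.WriteLine(lookup.Count);                  // 2
            Console.WriteLine(lookup["Z"].Count());           // 0
        }
    }
}
```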

Miscellaneous Methods

This section explores the Zip method.

Zip

The Zip method applies a specified function to the corresponding elements of two sequences, producing a sequence of the results. The method iterates through the two input sequences, applying the resultSelector function to each pair of corresponding elements, and returns the sequence of values that resultSelector produces. If the input sequences do not have the same number of elements, the method combines elements until it reaches the end of one of the sequences. For example, if one sequence has three elements and the other one has four, the result sequence has only three elements. This method is implemented by using deferred execution. The immediate return value is an object that stores all the information required to perform the action.

The Zip extension method combines the items of two lists based on the provided combination logic. The method signature shows that it extends the IEnumerable<TFirst> type and accepts an IEnumerable<TSecond> second and a Func<TFirst, TSecond, TResult> resultSelector as input:

public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first, IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> resultSelector)

The items of the first and second lists are combined item by item to produce a new sequence based on the combination logic provided in the resultSelector Func. Figure 12-28 shows how the Zip method combines each of the items in the lists.

Figure 12-28. Basic operation of the Zip method

From Figure 12-28 you can see that the Zip method combines the items in the first list, which contains {1, 2, 3, 4}, with the items in the second list, which contains {"One","Two","Three","Four"}, using the combination logic: item from the first list + ":\t" + item from the second list. Listing 12-85 presents an example of the use of the Zip method.

Listing 12-85. Example of the Zip Method

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
class Program
{
static void Main(string[] args)
{
IList<int> firstList = new List<int>()
{
1,2,3,4
};

IList<string> secondList = new List<string>()
{
"One","Two","Three","Four"
};

var result = firstList.Zip(secondList, (x, y) => x + ":\t" + y);

result.ToList().ForEach(x => Console.WriteLine(x));
}
}
}

This program will produce the output:

1: One
2: Two
3: Three
4: Four

Internal Operation of the Zip Method of Execution

When the program in Listing 12-85 is compiled and executed, the C# compiler and the CLR follow these steps:

1. Step 1: The C# compiler will construct a method <Main>b__2 using the anonymous method (x, y) => x + ":\t" + y. The CLR will pass this <Main>b__2 method to the MulticastDelegate class to instantiate an instance of it. The C# compiler compiles the (x, y) => x + ":\t" + y code block into the method, as shown in Listing 12-86.

Listing 12-86. <Main>b__2 Method Contents

.method private hidebysig static string '<Main>b__2'(int32 x,
string y) cil managed
{
// code size 22 (0x16)
.maxstack 3
.locals init ([0] string CS$1$0000)
IL_0000: ldarg.0
IL_0001: box [mscorlib]System.Int32
IL_0006: ldstr ":\t"
IL_000b: ldarg.1
IL_000c: call string [mscorlib]System.String::Concat(object,
object,
object)
IL_0011: stloc.0
IL_0012: br.s IL_0014
IL_0014: ldloc.0
IL_0015: ret
} // end of method Program::'<Main>b__2'

2. Step 2: The CLR passes the delegate instance created in step 1 to the Zip method, which returns the ZipIterator<TFirst, TSecond, TResult> instance after doing a few basic checks internally. The ZipIterator<TFirst, TSecond, TResult> instance holds the first and second lists and the resultSelector (the delegate instance created in step 1) inside it, as illustrated in Figure 12-29.

Figure 12-29. Operation details for the Zip method

3. Step 3: Because the Zip extension method uses a deferred execution pattern, nothing is combined until the CLR executes the ToList method, which iterates through the ZipIterator enumerator. Inside the ZipIterator enumerator, the CLR advances through both lists together and gets the Current item from each list. It passes those Current items to the resultSelector Func as input.

4. Step 4: The resultSelector then combines the provided items into one single item (for example, 1 from the firstList and One from the secondList are combined as 1: One) and returns it. This continues until either list has no more items.

In this iteration process, if one of the lists has fewer items than the other, the method returns only as many items as the shorter list contains. For example, if list A has the items {A1, B1, C1, D1} and list B has the items {A2, B2, C2}, then with + as the combination logic the result is {A1A2, B1B2, C1C2}. The D1 item from list A is dropped. The implementation of the ZipIterator used by the Zip extension is presented in Listing 12-87.

Listing 12-87. Implementation of the Zip Extension

private static IEnumerable<TResult> ZipIterator<TFirst, TSecond, TResult>(
IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> resultSelector)
{
using (IEnumerator<TFirst> iteratorVariable0 = first.GetEnumerator())
{
using (IEnumerator<TSecond> iteratorVariable1 = second.GetEnumerator())
{
while (iteratorVariable0.MoveNext() &&
iteratorVariable1.MoveNext())
{
yield return resultSelector(
iteratorVariable0.Current,
iteratorVariable1.Current);
}
}
}
}
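
The unequal-length case described above can be checked with a short program. This is a minimal sketch using the A and B lists from the text; ZipSketch and Combine are hypothetical names used only for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Ch12
{
    public static class ZipSketch
    {
        public static List<string> Combine(
            IEnumerable<string> first, IEnumerable<string> second)
        {
            // The loop in ZipIterator ends as soon as either MoveNext returns false.
            return first.Zip(second, (x, y) => x + y).ToList();
        }

        public static void Main(string[] args)
        {
            string[] a = { "A1", "B1", "C1", "D1" };
            string[] b = { "A2", "B2", "C2" };

            // D1 has no partner in the shorter list, so it is not part of the result.
            Console.WriteLine(string.Join(",", Combine(a, b))); // A1A2,B1B2,C1C2
        }
    }
}
```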

Summary

In this chapter, we have learned about Linq in C# and the different extension methods that the Enumerable class provides for Linq. All of the methods discussed in this chapter are based on the delegate-based query syntax. Examining the internal operations of each of these extension methods has provided a better understanding of them and will help you use them more efficiently. The next chapter will discuss exception management.