C# 5.0 in a Nutshell (2012)
Chapter 9. LINQ Operators
This chapter describes each of the LINQ query operators. As well as serving as a reference, two of the sections, “Projection” and Joining, cover a number of conceptual areas:
§ Projecting object hierarchies
§ Joining with Select, SelectMany, Join, and GroupJoin
§ Query expressions with multiple range variables
All of the examples in this chapter assume that a names array is defined as follows:
string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };
Examples that query a database assume that a typed DataContext variable called dataContext is instantiated as follows:
var dataContext = new NutshellContext ("connection string...");
...
public class NutshellContext : DataContext
{
public NutshellContext (string cxString) : base (cxString) {}
public Table<Customer> Customers { get { return GetTable<Customer>(); } }
public Table<Purchase> Purchases { get { return GetTable<Purchase>(); } }
}
[Table] public class Customer
{
[Column(IsPrimaryKey=true)] public int ID;
[Column] public string Name;
[Association (OtherKey="CustomerID")]
public EntitySet<Purchase> Purchases = new EntitySet<Purchase>();
}
[Table] public class Purchase
{
[Column(IsPrimaryKey=true)] public int ID;
[Column] public int? CustomerID;
[Column] public string Description;
[Column] public decimal Price;
[Column] public DateTime Date;
EntityRef<Customer> custRef;
[Association (Storage="custRef",ThisKey="CustomerID",IsForeignKey=true)]
public Customer Customer
{
get { return custRef.Entity; } set { custRef.Entity = value; }
}
}
NOTE
All the examples in this chapter are preloaded into LINQPad, along with a sample database with a matching schema. You can download LINQPad from http://www.linqpad.net.
The entity classes shown are a simplified version of what LINQ to SQL tools typically produce, and do not include code to update the opposing side in a relationship when their entities have been reassigned.
Here are the corresponding SQL table definitions:
create table Customer
(
ID int not null primary key,
Name varchar(30) not null
)
create table Purchase
(
ID int not null primary key,
CustomerID int references Customer (ID),
Description varchar(30) not null,
Price decimal not null
)
NOTE
All examples will also work with Entity Framework, except where otherwise indicated. You can build an Entity Framework ObjectContext from these tables by creating a new Entity Data Model in Visual Studio, and then dragging the tables on to the designer surface.
Overview
In this section, we provide an overview of the standard query operators.
The standard query operators fall into three categories:
§ Sequence in, sequence out (sequence-to-sequence)
§ Sequence in, single element or scalar value out
§ Nothing in, sequence out (generation methods)
We first present each of the three categories and the query operators they include, and then we take up each individual query operator in detail.
Sequence→Sequence
Most query operators fall into this category—accepting one or more sequences as input and emitting a single output sequence. Figure 9-1 illustrates those operators that restructure the shape of the sequences.
Figure 9-1. Shape-changing operators
Filtering
IEnumerable<TSource> →IEnumerable<TSource>
Returns a subset of the original elements.
Where, Take, TakeWhile, Skip, SkipWhile, Distinct
Projecting
IEnumerable<TSource> →IEnumerable<TResult>
Transforms each element with a lambda function. SelectMany flattens nested sequences; Select and SelectMany perform inner joins, left outer joins, cross joins, and non-equi joins with LINQ to SQL and EF.
Select, SelectMany
Joining
IEnumerable<TOuter>, IEnumerable<TInner>→ IEnumerable<TResult>
Meshes elements of one sequence with another. Join and GroupJoin operators are designed to be efficient with local queries and support inner and left outer joins. The Zip operator enumerates two sequences in step, applying a function over each element pair. Rather than naming the type arguments TOuter and TInner, the Zip operator names them TFirst and TSecond:
IEnumerable<TFirst>, IEnumerable<TSecond>→ IEnumerable<TResult>
Join, GroupJoin, Zip
Ordering
IEnumerable<TSource> →IOrderedEnumerable<TSource>
Returns a reordering of a sequence.
OrderBy, ThenBy, Reverse
Grouping
IEnumerable<TSource> →IEnumerable<IGrouping<TSource,TElement>>
Groups a sequence into subsequences.
GroupBy
Set operators
IEnumerable<TSource>, IEnumerable<TSource>→ IEnumerable<TSource>
Takes two same-typed sequences and returns their commonality, sum, or difference.
Concat, Union, Intersect, Except
Conversion methods: Import
IEnumerable→IEnumerable<TResult>
OfType, Cast
Conversion methods: Export
IEnumerable<TSource> →An array, list, dictionary, lookup, or sequence
ToArray, ToList, ToDictionary, ToLookup, AsEnumerable, AsQueryable
Sequence→Element or Value
The following query operators accept an input sequence and emit a single element or value.
Element operators
IEnumerable<TSource> →TSource
Picks a single element from a sequence.
First, FirstOrDefault, Last, LastOrDefault, Single, SingleOrDefault,
ElementAt, ElementAtOrDefault, DefaultIfEmpty
Aggregation methods
IEnumerable<TSource> →scalar
Performs a computation across a sequence, returning a scalar value (typically a number).
Aggregate, Average, Count, LongCount, Sum, Max, Min
Quantifiers
IEnumerable<TSource> →bool
An aggregation returning true or false.
All, Any, Contains, SequenceEqual
Void→Sequence
In the third and final category are query operators that produce an output sequence from scratch.
Generation methods
void→IEnumerable<TResult>
Manufactures a simple sequence.
Empty, Range, Repeat
Filtering
IEnumerable<TSource>→ IEnumerable<TSource>
Method |
Description |
SQL equivalents |
Where |
Returns a subset of elements that satisfy a given condition |
WHERE |
Take |
Returns the first count elements and discards the rest |
WHERE ROW_NUMBER()... or TOP n subquery |
Skip |
Ignores the first count elements and returns the rest |
WHERE ROW_NUMBER()... or NOT IN (SELECT TOP n...) |
TakeWhile |
Emits elements from the input sequence until the predicate is false |
Exception thrown |
SkipWhile |
Ignores elements from the input sequence until the predicate is false, and then emits the rest |
Exception thrown |
Distinct |
Returns a sequence that excludes duplicates |
SELECT DISTINCT... |
NOTE
The “SQL equivalents” column in the reference tables in this chapter do not necessarily correspond to what an IQueryable implementation such as LINQ to SQL will produce. Rather, it indicates what you’d typically use to do the same job if you were writing the SQL query yourself. Where there is no simple translation, the column is left blank. Where there is no translation at all, the column reads “Exception thrown”.
Enumerable implementation code, when shown, excludes checking for null arguments and indexing predicates.
With each of the filtering methods, you always end up with either the same number or fewer elements than you started with. You can never get more! The elements are also identical when they come out; they are not transformed in any way.
Where
Argument |
Type |
Source sequence |
IEnumerable<TSource> |
Predicate |
TSource => bool or (TSource,int) => bool[a] |
[a] Prohibited with LINQ to SQL and Entity Framework |
Query syntax
where bool-expression
Enumerable.Where implementation
The internal implementation of Enumerable.Where, null checking aside, is functionally equivalent to the following:
public static IEnumerable<TSource> Where<TSource>
(this IEnumerable<TSource> source, Func <TSource, bool> predicate)
{
foreach (TSource element in source)
if (predicate (element))
yield return element;
}
Overview
Where returns the elements from the input sequence that satisfy the given predicate. For instance:
string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };
IEnumerable<string> query = names.Where (name => name.EndsWith ("y"));
// Result: { "Harry", "Mary", "Jay" }
In query syntax:
IEnumerable<string> query = from n in names
where n.EndsWith ("y")
select n;
A where clause can appear more than once in a query and be interspersed with let, orderby and join clauses:
from n in names
where n.Length > 3
let u = n.ToUpper()
where u.EndsWith ("Y")
select u; // Result: { "HARRY", "MARY" }
Standard C# scoping rules apply to such queries. In other words, you cannot refer to a variable prior to declaring it with a range variable or a let clause.
Indexed filtering
Where’s predicate optionally accepts a second argument, of type int. This is fed with the position of each element within the input sequence, allowing the predicate to use this information in its filtering decision. For example, the following skips every second element:
IEnumerable<string> query = names.Where ((n, i) => i % 2 == 0);
// Result: { "Tom", "Harry", "Jay" }
An exception is thrown if you use indexed filtering in LINQ to SQL or EF.
SQL LIKE comparisons in LINQ to SQL and EF
The following methods on string translate to SQL’s LIKE operator:
Contains, StartsWith, EndsWith
For instance, c.Name.Contains ("abc") translates to customer.Name LIKE '%abc%' (or more accurately, a parameterized version of this). Contains lets you compare only against a locally evaluated expression; to compare against another column, you must use theSqlMethods.Like method:
... where SqlMethods.Like (c.Description, "%" + c.Name + "%")
SqlMethods.Like also lets you perform more complex comparisons (e.g., LIKE 'abc%def%').
< and > string comparisons in LINQ to SQL and EF
You can perform order comparison on strings with string’s CompareTo method; this maps to SQL’s < and > operators:
dataContext.Purchases.Where (p => p.Description.CompareTo ("C") < 0)
WHERE x IN (..., ..., ...) in LINQ to SQL and EF
With LINQ to SQL and EF, you can apply the Contains operator to a local collection within a filter predicate.
For instance:
string[] chosenOnes = { "Tom", "Jay" };
from c in dataContext.Customers
where chosenOnes.Contains (c.Name)
...
This maps to SQL’s IN operator—in other words:
WHERE customer.Name IN ("Tom", "Jay")
If the local collection is an array of entities or nonscalar types, LINQ to SQL or EF may instead emit an EXISTS clause.
Take and Skip
Argument |
Type |
Source sequence |
IEnumerable<TSource> |
Number of elements to take or skip |
int |
Take emits the first n elements and discards the rest; Skip discards the first n elements and emits the rest. The two methods are useful together when implementing a web page allowing a user to navigate through a large set of matching records. For instance, suppose a user searches a book database for the term “mercury”, and there are 100 matches. The following returns the first 20:
IQueryable<Book> query = dataContext.Books
.Where (b => b.Title.Contains ("mercury"))
.OrderBy (b => b.Title)
.Take (20);
The next query returns books 21 to 40:
IQueryable<Book> query = dataContext.Books
.Where (b => b.Title.Contains ("mercury"))
.OrderBy (b => b.Title)
.Skip (20).Take (20);
LINQ to SQL and EF translate Take and Skip to the ROW_NUMBER function in SQL Server 2005, or a TOP n subquery in earlier versions of SQL Server.
TakeWhile and SkipWhile
Argument |
Type |
Source sequence |
IEnumerable<TSource> |
Predicate |
TSource => bool or (TSource,int) => bool |
TakeWhile enumerates the input sequence, emitting each item, until the given predicate is false. It then ignores the remaining elements:
int[] numbers = { 3, 5, 2, 234, 4, 1 };
var takeWhileSmall = numbers.TakeWhile (n => n < 100); // { 3, 5, 2 }
SkipWhile enumerates the input sequence, ignoring each item until the given predicate is false. It then emits the remaining elements:
int[] numbers = { 3, 5, 2, 234, 4, 1 };
var skipWhileSmall = numbers.SkipWhile (n => n < 100); // { 234, 4, 1 }
TakeWhile and SkipWhile have no translation to SQL and cause a runtime error if used in a LINQ-to-db query.
Distinct
Distinct returns the input sequence, stripped of duplicates. You can optionally pass in a custom equality comparer. The following returns distinct letters in a string:
char[] distinctLetters = "HelloWorld".Distinct().ToArray();
string s = new string (distinctLetters); // HeloWrd
We can call LINQ methods directly on a string, because string implements IEnumerable<char>.
Projecting
IEnumerable<TSource>→ IEnumerable<TResult>
Method |
Description |
SQL equivalents |
Select |
Transforms each input element with the given lambda expression |
SELECT |
SelectMany |
Transforms each input element, and then flattens and concatenates the resultant subsequences |
INNER JOIN, LEFT OUTER JOIN, CROSS JOIN |
NOTE
When querying a database, Select and SelectMany are the most versatile joining constructs; for local queries, Join and GroupJoin are the most efficient joining constructs.
Select
Argument |
Type |
Source sequence |
IEnumerable<TSource> |
Result selector |
TSource => TResult or (TSource,int) => TResult[a] |
[a] Prohibited with LINQ to SQL and Entity Framework |
Query syntax
select projection-expression
Enumerable implementation
public static IEnumerable<TResult> Select<TSource,TResult>
(this IEnumerable<TSource> source, Func<TSource,TResult> selector)
{
foreach (TSource element in source)
yield return selector (element);
}
Overview
With Select, you always get the same number of elements that you started with. Each element, however, can be transformed in any manner by the lambda function.
The following selects the names of all fonts installed on the computer (from System.Drawing):
IEnumerable<string> query = from f in FontFamily.Families
select f.Name;
foreach (string name in query) Console.WriteLine (name);
In this example, the select clause converts a FontFamily object to its name. Here’s the lambda equivalent:
IEnumerable<string> query = FontFamily.Families.Select (f => f.Name);
Select statements are often used to project into anonymous types:
var query =
from f in FontFamily.Families
select new { f.Name, LineSpacing = f.GetLineSpacing (FontStyle.Bold) };
A projection with no transformation is sometimes used with query syntax, in order to satisfy the requirement that the query end in a select or group clause. The following selects fonts supporting strikeout:
IEnumerable<FontFamily> query =
from f in FontFamily.Families
where f.IsStyleAvailable (FontStyle.Strikeout)
select f;
foreach (FontFamily ff in query) Console.WriteLine (ff.Name);
In such cases, the compiler omits the projection when translating to fluent syntax.
Indexed projection
The selector expression can optionally accept an integer argument, which acts as an indexer, providing the expression with the position of each input in the input sequence. This works only with local queries:
string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };
IEnumerable<string> query = names
.Select ((s,i) => i + "=" + s); // { "0=Tom", "1=Dick", ... }
Select subqueries and object hierarchies
You can nest a subquery in a select clause to build an object hierarchy. The following example returns a collection describing each directory under D:\source, with a subcollection of files under each directory:
DirectoryInfo[] dirs = new DirectoryInfo (@"d:\source").GetDirectories();
var query =
from d in dirs
where (d.Attributes & FileAttributes.System) == 0
select new
{
DirectoryName = d.FullName,
Created = d.CreationTime,
Files = from f in d.GetFiles()
where (f.Attributes & FileAttributes.Hidden) == 0
select new { FileName = f.Name, f.Length, }
};
foreach (var dirFiles in query)
{
Console.WriteLine ("Directory: " + dirFiles.DirectoryName);
foreach (var file in dirFiles.Files)
Console.WriteLine (" " + file.FileName + " Len: " + file.Length);
}
The inner portion of this query can be called a correlated subquery. A subquery is correlated if it references an object in the outer query—in this case, it references d, the directory being enumerated.
NOTE
A subquery inside a Select allows you to map one object hierarchy to another, or map a relational object model to a hierarchical object model.
With local queries, a subquery within a Select causes double-deferred execution. In our example, the files don’t get filtered or projected until the inner foreach statement enumerates.
Subqueries and joins in LINQ to SQL and EF
Subquery projections work well in LINQ to SQL and EF and can be used to do the work of SQL-style joins. Here’s how we retrieve each customer’s name along with their high-value purchases:
var query =
from c in dataContext.Customers
select new {
c.Name,
Purchases = from p in dataContext.Purchases
where p.CustomerID == c.ID && p.Price > 1000
select new { p.Description, p.Price }
};
foreach (var namePurchases in query)
{
Console.WriteLine ("Customer: " + namePurchases.Name);
foreach (var purchaseDetail in namePurchases.Purchases)
Console.WriteLine (" - $$$: " + purchaseDetail.Price);
}
WARNING
This style of query is ideally suited to interpreted queries. The outer query and subquery are processed as a unit, avoiding unnecessary round-tripping. With local queries, however, it’s inefficient because every combination of outer and inner elements must be enumerated to get the few matching combinations. A better choice for local queries is Join or GroupJoin, described in the following sections.
This query matches up objects from two disparate collections, and it can be thought of as a “Join.” The difference between this and a conventional database join (or subquery) is that we’re not flattening the output into a single two-dimensional result set. We’re mapping the relational data to hierarchical data, rather than to flat data.
Here’s the same query simplified by using the Purchases association property on the Customer entity:
from c in dataContext.Customers
select new
{
c.Name,
Purchases = from p in c.Purchases // Purchases is EntitySet<Purchase>
where p.Price > 1000
select new { p.Description, p.Price }
};
Both queries are analogous to a left outer join in SQL in the sense that we get all customers in the outer enumeration, regardless of whether they have any purchases. To emulate an inner join—where customers without high-value purchases are excluded—we would need to add a filter condition on the purchases collection:
from c in dataContext.Customers
where c.Purchases.Any (p => p.Price > 1000)
select new {
c.Name,
Purchases = from p in c.Purchases
where p.Price > 1000
select new { p.Description, p.Price }
};
This is slightly untidy, however, in that we’ve written the same predicate (Price > 1000) twice. We can avoid this duplication with a let clause:
from c in dataContext.Customers
let highValueP = from p in c.Purchases
where p.Price > 1000
select new { p.Description, p.Price }
where highValueP.Any()
select new { c.Name, Purchases = highValueP };
This style of query is flexible. By changing Any to Count, for instance, we can modify the query to retrieve only customers with at least two high-value purchases:
...
where highValueP.Count() >= 2
select new { c.Name, Purchases = highValueP };
Projecting into concrete types
Projecting into anonymous types is useful in obtaining intermediate results, but not so useful if you want to send a result set back to a client, for instance, because anonymous types can exist only as local variables within a method. An alternative is to use concrete types for projections, such as DataSets or custom business entity classes. A custom business entity is simply a class that you write with some properties, similar to a LINQ to SQL [Table] annotated class or an EF Entity, but designed to hide lower-level (database-related) details. You might exclude foreign key fields from business entity classes, for instance. Assuming we wrote custom entity classes called CustomerEntity and PurchaseEntity, here’s how we could project into them:
IQueryable<CustomerEntity> query =
from c in dataContext.Customers
select new CustomerEntity
{
Name = c.Name,
Purchases =
(from p in c.Purchases
where p.Price > 1000
select new PurchaseEntity {
Description = p.Description,
Value = p.Price
}
).ToList()
};
// Force query execution, converting output to a more convenient List:
List<CustomerEntity> result = query.ToList();
Notice that so far, we’ve not had to use a Join or SelectMany statement. This is because we’re maintaining the hierarchical shape of the data, as illustrated in Figure 9-2. With LINQ, you can often avoid the traditional SQL approach of flattening tables into a two-dimensional result set.
Figure 9-2. Projecting an object hierarchy
SelectMany
Argument |
Type |
Source sequence |
IEnumerable<TSource> |
Result selector |
TSource => IEnumerable<TResult> or (TSource,int) => IEnumerable<TResult>[a] |
[a] Prohibited with LINQ to SQL |
Query syntax
from identifier1 in enumerable-expression1
from identifier2 in enumerable-expression2
...
Enumerable implementation
public static IEnumerable<TResult> SelectMany<TSource,TResult>
(IEnumerable<TSource> source,
Func <TSource,IEnumerable<TResult>> selector)
{
foreach (TSource element in source)
foreach (TResult subElement in selector (element))
yield return subElement;
}
Overview
SelectMany concatenates subsequences into a single flat output sequence.
Recall that for each input element, Select yields exactly one output element. In contrast, SelectMany yields 0..n output elements. The 0..n elements come from a subsequence or child sequence that the lambda expression must emit.
SelectMany can be used to expand child sequences, flatten nested collections, and join two collections into a flat output sequence. Using the conveyor belt analogy, SelectMany funnels fresh material onto a conveyor belt. With SelectMany, each input element is the trigger for the introduction of fresh material. The fresh material is emitted by the selector lambda expression and must be a sequence. In other words, the lambda expression must emit a child sequence per input element. The final result is a concatenation of the child sequences emitted for each input element.
Starting with a simple example, suppose we have an array of names as follows:
string[] fullNames = { "Anne Williams", "John Fred Smith", "Sue Green" };
which we wish to convert to a single flat collection of words—in other words:
"Anne", "Williams", "John", "Fred", "Smith", "Sue", Green"
SelectMany is ideal for this task, because we’re mapping each input element to a variable number of output elements. All we must do is come up with a selector expression that converts each input element to a child sequence. string.Split does the job nicely: it takes a string and splits it into words, emitting the result as an array:
string testInputElement = "Anne Williams";
string[] childSequence = testInputElement.Split();
// childSequence is { "Anne", "Williams" };
So, here’s our SelectMany query and the result:
IEnumerable<string> query = fullNames.SelectMany (name => name.Split());
foreach (string name in query)
Console.Write (name + "|"); // Anne|Williams|John|Fred|Smith|Sue|Green|
NOTE
If you replace SelectMany with Select, you get the same results in hierarchical form. The following emits a sequence of string arrays, requiring nested foreach statements to enumerate:
IEnumerable<string[]> query =
fullNames.Select (name => name.Split());
foreach (string[] stringArray in query)
foreach (string name in stringArray)
Console.Write (name + "|");
The benefit of SelectMany is that it yields a single flat result sequence.
SelectMany is supported in query syntax and is invoked by having an additional generator—in other words, an extra from clause in the query. The from keyword has two meanings in query syntax. At the start of a query, it introduces the original range variable and input sequence.Anywhere else in the query, it translates to SelectMany. Here’s our query in query syntax:
IEnumerable<string> query =
from fullName in fullNames
from name in fullName.Split() // Translates to SelectMany
select name;
Note that the additional generator introduces a new range variable—in this case, name. The old range variable stays in scope, however, and we can subsequently access both.
Multiple range variables
In the preceding example, both name and fullName remain in scope until the query either ends or reaches an into clause. The extended scope of these variables is the killer scenario for query syntax over fluent syntax.
To illustrate, we can take the preceding query and include fullName in the final projection:
IEnumerable<string> query =
from fullName in fullNames
from name in fullName.Split()
select name + " came from " + fullName;
Anne came from Anne Williams
Williams came from Anne Williams
John came from John Fred Smith
...
Behind the scenes, the compiler must pull some tricks to let you access both variables. A good way to appreciate this is to try writing the same query in fluent syntax. It’s tricky! It gets harder still if you insert a where or orderby clause before projecting:
from fullName in fullNames
from name in fullName.Split()
orderby fullName, name
select name + " came from " + fullName;
The problem is that SelectMany emits a flat sequence of child elements—in our case, a flat collection of words. The original “outer” element from which it came (fullName) is lost. The solution is to “carry” the outer element with each child, in a temporary anonymous type:
from fullName in fullNames
from x in fullName.Split().Select (name => new { name, fullName } )
orderby x.fullName, x.name
select x.name + " came from " + x.fullName;
The only change here is that we’re wrapping each child element (name) in an anonymous type that also contains its fullName. This is similar to how a let clause is resolved. Here’s the final conversion to fluent syntax:
IEnumerable<string> query = fullNames
.SelectMany (fName => fName.Split()
.Select (name => new { name, fName } ))
.OrderBy (x => x.fName)
.ThenBy (x => x.name)
.Select (x => x.name + " came from " + x.fName);
Thinking in query syntax
As we just demonstrated, there are good reasons to use query syntax if you need multiple range variables. In such cases, it helps not only to use query syntax, but also to think directly in its terms.
There are two basic patterns when writing additional generators. The first is expanding and flattening subsequences. To do this, you call a property or method on an existing range variable in your additional generator. We did this in the previous example:
from fullName in fullNames
from name in fullName.Split()
Here, we’ve expanded from enumerating full names to enumerating words. An analogous LINQ-to-db query is when you expand child association properties. The following query lists all customers along with their purchases:
IEnumerable<string> query = from c in dataContext.Customers
from p in c.Purchases
select c.Name + " bought a " + p.Description;
Tom bought a Bike
Tom bought a Holiday
Dick bought a Phone
Harry bought a Car
...
Here, we’ve expanded each customer into a subsequence of purchases.
The second pattern is performing a cartesian product or cross join—where every element of one sequence is matched with every element of another. To do this, introduce a generator whose selector expression returns a sequence unrelated to a range variable:
int[] numbers = { 1, 2, 3 }; string[] letters = { "a", "b" };
IEnumerable<string> query = from n in numbers
from l in letters
select n.ToString() + l;
RESULT: { "1a", "1b", "2a", "2b", "3a", "3b" }
This style of query is the basis of SelectMany-style joins.
Joining with SelectMany
You can use SelectMany to join two sequences, simply by filtering the results of a cross product. For instance, suppose we wanted to match players for a game. We could start as follows:
string[] players = { "Tom", "Jay", "Mary" };
IEnumerable<string> query = from name1 in players
from name2 in players
select name1 + " vs " + name2;
RESULT: { "Tom vs Tom", "Tom vs Jay", "Tom vs Mary",
"Jay vs Tom", "Jay vs Jay", "Jay vs Mary",
"Mary vs Tom", "Mary vs "Jay", "Mary vs Mary" }
The query reads: “For every player, reiterate every player, selecting player 1 vs player 2.” Although we got what we asked for (a cross join), the results are not useful until we add a filter:
IEnumerable<string> query = from name1 in players
from name2 in players
where name1.CompareTo (name2) < 0
orderby name1, name2
select name1 + " vs " + name2;
RESULT: { "Jay vs Mary", "Jay vs Tom", "Mary vs Tom" }
The filter predicate constitutes the join condition. Our query can be called a non-equi join, because the join condition doesn’t use an equality operator.
We’ll demonstrate the remaining types of joins with LINQ to SQL (they’ll also work with EF except where we explicitly use a foreign key field).
SelectMany in LINQ to SQL and EF
SelectMany in LINQ to SQL and EF can perform cross joins, non-equi joins, inner joins, and left outer joins. You can use SelectMany with both predefined associations and ad hoc relationships—just as with Select. The difference is that SelectMany returns a flat rather than a hierarchical result set.
A LINQ-to-db cross join is written just as in the preceding section. The following query matches every customer to every purchase (a cross join):
var query = from c in dataContext.Customers
from p in dataContext.Purchases
select c.Name + " might have bought a " + p.Description;
More typically, though, you’d want to match customers to their own purchases only. You achieve this by adding a where clause with a joining predicate. This results in a standard SQL-style equi-join:
var query = from c in dataContext.Customers
from p in dataContext.Purchases
where c.ID == p.CustomerID
select c.Name + " bought a " + p.Description;
NOTE
This translates well to SQL. In the next section, we’ll see how it extends to support outer joins. Reformulating such queries with LINQ’s Join operator actually makes them less extensible—LINQ is opposite to SQL in this sense.
If you have association properties for relationships in your entities, you can express the same query by expanding the subcollection instead of filtering the cross product:
from c in dataContext.Customers
from p in c.Purchases
select new { c.Name, p.Description };
NOTE
Entity Framework doesn’t expose foreign keys in the entities, so for recognized relationships you must use its association properties rather than joining manually as we did previously.
The advantage is that we’ve eliminated the joining predicate. We’ve gone from filtering a cross product to expanding and flattening. Both queries, however, will result in the same SQL.
You can add where clauses to such a query for additional filtering. For instance, if we wanted only customers whose names started with “T”, we could filter as follows:
from c in dataContext.Customers
where c.Name.StartsWith ("T")
from p in c.Purchases
select new { c.Name, p.Description };
This LINQ-to-db query would work equally well if the where clause is moved one line down. If it is a local query, however, moving the where clause down would make it less efficient. With local queries, you should filter before joining.
You can introduce new tables into the mix with additional from clauses. For instance, if each purchase had purchase item child rows, you could produce a flat result set of customers with their purchases, each with their purchase detail lines as follows:
from c in dataContext.Customers
from p in c.Purchases
from pi in p.PurchaseItems
select new { c.Name, p.Description, pi.DetailLine };
Each from clause introduces a new child table. To include data from a parent table (via an association property), you don’t add a from clause—you simply navigate to the property. For example, if each customer has a salesperson whose name you want to query, just do this:
from c in dataContext.Customers
select new { Name = c.Name, SalesPerson = c.SalesPerson.Name };
You don’t use SelectMany in this case because there’s no subcollection to flatten. Parent association properties return a single item.
Outer joins with SelectMany
We saw previously that a Select subquery yields a result analogous to a left outer join.
from c in dataContext.Customers
select new {
c.Name,
Purchases = from p in c.Purchases
where p.Price > 1000
select new { p.Description, p.Price }
};
In this example, every outer element (customer) is included, regardless of whether the customer has any purchases. But suppose we rewrite this query with SelectMany, so we can obtain a single flat collection rather than a hierarchical result set:
from c in dataContext.Customers
from p in c.Purchases
where p.Price > 1000
select new { c.Name, p.Description, p.Price };
In the process of flattening the query, we’ve switched to an inner join: customers are now included only for whom one or more high-value purchases exist. To get a left outer join with a flat result set, we must apply the DefaultIfEmpty query operator on the inner sequence. This method returns a sequence with a single null element if its input sequence has no elements. Here’s such a query, price predicate aside:
from c in dataContext.Customers
from p in c.Purchases.DefaultIfEmpty()
select new { c.Name, p.Description, Price = (decimal?) p.Price };
This works perfectly with LINQ to SQL and EF, returning all customers, even if they have no purchases. But if we were to run this as a local query, it would crash, because when p is null, p.Description and p.Price throw a NullReferenceException. We can make our query robust in either scenario as follows:
from c in dataContext.Customers
from p in c.Purchases.DefaultIfEmpty()
select new {
c.Name,
Descript = p == null ? null : p.Description,
Price = p == null ? (decimal?) null : p.Price
};
Let’s now reintroduce the price filter. We cannot use a where clause as we did before, because it would execute after DefaultIfEmpty:
from c in dataContext.Customers
from p in c.Purchases.DefaultIfEmpty()
where p.Price > 1000...
The correct solution is to splice the Where clause before DefaultIfEmpty with a subquery:
from c in dataContext.Customers
from p in c.Purchases.Where (p => p.Price > 1000).DefaultIfEmpty()
select new {
c.Name,
Descript = p == null ? null : p.Description,
Price = p == null ? (decimal?) null : p.Price
};
LINQ to SQL and EF translate this to a left outer join. This is an effective pattern for writing such queries.
NOTE
If you’re used to writing outer joins in SQL, you might be tempted to overlook the simpler option of a Select subquery for this style of query, in favor of the awkward but familiar SQL-centric flat approach. The hierarchical result set from a Select subquery is often better suited to outer join-style queries because there are no additional nulls to deal with.
Joining
Method |
Description |
SQL equivalents |
Join |
Applies a lookup strategy to match elements from two collections, emitting a flat result set |
INNER JOIN |
GroupJoin |
As above, but emits a hierarchical result set |
INNER JOIN, LEFT OUTER JOIN |
Zip |
Enumerates two sequences in step (like a zipper), applying a function over each element pair |
Join and GroupJoin
IEnumerable<TOuter>, IEnumerable<TInner>→IEnumerable<TResult>
Join arguments
Argument |
Type |
Outer sequence |
IEnumerable<TOuter> |
Inner sequence |
IEnumerable<TInner> |
Outer key selector |
TOuter => TKey |
Inner key selector |
TInner => TKey |
Result selector |
(TOuter,TInner) => TResult |
GroupJoin arguments
Argument |
Type |
Outer sequence |
IEnumerable<TOuter> |
Inner sequence |
IEnumerable<TInner> |
Outer key selector |
TOuter => TKey |
Inner key selector |
TInner => TKey |
Result selector |
(TOuter,IEnumerable<TInner>) => TResult |
Query syntax
from outer-var in outer-enumerable
join inner-var in inner-enumerable on outer-key-expr equals inner-key-expr
[ into identifier ]
Overview
Join and GroupJoin mesh two input sequences into a single output sequence. Join emits flat output; GroupJoin emits hierarchical output.
Join and GroupJoin provide an alternative strategy to Select and SelectMany. The advantage of Join and GroupJoin is that they execute efficiently over local in-memory collections, since they first load the inner sequence into a keyed lookup, avoiding the need to repeatedly enumerate over every inner element. The disadvantage is that they offer the equivalent of inner and left outer joins only; cross joins and non-equi joins must still be done with Select/SelectMany. With LINQ to SQL and Entity Framework queries, Join and GroupJoin offer no real benefits over Select and SelectMany.
Table 9-1 summarizes the differences between each of the joining strategies.
Table 9-1. Joining strategies
Strategy |
Result shape |
Local query efficiency |
Inner joins |
Left outer joins |
Cross joins |
Non-equi joins |
Select + SelectMany |
Flat |
Bad |
Yes |
Yes |
Yes |
Yes |
Select + Select |
Nested |
Bad |
Yes |
Yes |
Yes |
Yes |
Join |
Flat |
Good |
Yes |
- |
- |
- |
GroupJoin |
Nested |
Good |
Yes |
Yes |
- |
- |
GroupJoin + SelectMany |
Flat |
Good |
Yes |
Yes |
- |
- |
Join
The Join operator performs an inner join, emitting a flat output sequence.
NOTE
Entity Framework hides foreign key fields, so you can’t manually join across natural relationships (instead, you can query across association properties, as we described in the previous two sections).
The simplest way to demonstrate Join is with LINQ to SQL. The following query lists all customers alongside their purchases, without using an association property:
IQueryable<string> query =
from c in dataContext.Customers
join p in dataContext.Purchases on c.ID equals p.CustomerID
select c.Name + " bought a " + p.Description;
The results match what we would get from a SelectMany-style query:
Tom bought a Bike
Tom bought a Holiday
Dick bought a Phone
Harry bought a Car
To see the benefit of Join over SelectMany, we must convert this to a local query. We can demonstrate this by first copying all customers and purchases to arrays, and then querying the arrays:
Customer[] customers = dataContext.Customers.ToArray();
Purchase[] purchases = dataContext.Purchases.ToArray();
var slowQuery = from c in customers
from p in purchases where c.ID == p.CustomerID
select c.Name + " bought a " + p.Description;
var fastQuery = from c in customers
join p in purchases on c.ID equals p.CustomerID
select c.Name + " bought a " + p.Description;
Although both queries yield the same results, the Join query is considerably faster because its implementation in Enumerable preloads the inner collection (purchases) into a keyed lookup.
The query syntax for join can be written in general terms as follows:
join inner-var in inner-sequence on outer-key-expr equals inner-key-expr
Join operators in LINQ differentiate between the outer sequence and inner sequence. Syntactically:
§ The outer sequence is the input sequence (in this case, customers).
§ The inner sequence is the new collection you introduce (in this case, purchases).
Join performs inner joins, meaning customers without purchases are excluded from the output. With inner joins, you can swap the inner and outer sequences in the query and still get the same results:
from p in purchases // p is now outer
join c in customers on p.CustomerID equals c.ID // c is now inner
...
You can add further join clauses to the same query. If each purchase, for instance, has one or more purchase items, you could join the purchase items as follows:
from c in customers
join p in purchases on c.ID equals p.CustomerID // first join
join pi in purchaseItems on p.ID equals pi.PurchaseID // second join
...
purchases acts as the inner sequence in the first join and as the outer sequence in the second join. You could obtain the same results (inefficiently) using nested foreach statements as follows:
foreach (Customer c in customers)
foreach (Purchase p in purchases)
if (c.ID == p.CustomerID)
foreach (PurchaseItem pi in purchaseItems)
if (p.ID == pi.PurchaseID)
Console.WriteLine (c.Name + "," + p.Price + "," + pi.Detail);
In query syntax, variables from earlier joins remain in scope—just as they do with SelectMany-style queries. You’re also permitted to insert where and let clauses in between join clauses.
Joining on multiple keys
You can join on multiple keys with anonymous types as follows:
from x in sequenceX
join y in sequenceY on new { K1 = x.Prop1, K2 = x.Prop2 }
equals new { K1 = y.Prop3, K2 = y.Prop4 }
...
For this to work, the two anonymous types must be structured identically. The compiler then implements each with the same internal type, making the joining keys compatible.
Joining in fluent syntax
The following query syntax join:
from c in customers
join p in purchases on c.ID equals p.CustomerID
select new { c.Name, p.Description, p.Price };
in fluent syntax is as follows:
customers.Join ( // outer collection
purchases, // inner collection
c => c.ID, // outer key selector
p => p.CustomerID, // inner key selector
(c, p) => new
{ c.Name, p.Description, p.Price } // result selector
);
The result selector expression at the end creates each element in the output sequence. If you have additional clauses prior to projecting, such as orderby in this example:
from c in customers
join p in purchases on c.ID equals p.CustomerID
orderby p.Price
select c.Name + " bought a " + p.Description;
you must manufacture a temporary anonymous type in the result selector in fluent syntax. This keeps both c and p in scope following the join:
customers.Join ( // outer collection
purchases, // inner collection
c => c.ID, // outer key selector
p => p.CustomerID, // inner key selector
(c, p) => new { c, p } ) // result selector
.OrderBy (x => x.p.Price)
.Select (x => x.c.Name + " bought a " + x.p.Description);
Query syntax is usually preferable when joining; it’s less fiddly.
GroupJoin
GroupJoin does the same work as Join, but instead of yielding a flat result, it yields a hierarchical result, grouped by each outer element. It also allows left outer joins.
The query syntax for GroupJoin is the same as for Join, but is followed by the into keyword.
Here’s the most basic example:
IEnumerable<IEnumerable<Purchase>> query =
from c in customers
join p in purchases on c.ID equals p.CustomerID
into custPurchases
select custPurchases; // custPurchases is a sequence
NOTE
An into clause translates to GroupJoin only when it appears directly after a join clause. After a select or group clause, it means query continuation. The two uses of the into keyword are quite different, although they have one feature in common: they both introduce a new range variable.
The result is a sequence of sequences, which we could enumerate as follows:
foreach (IEnumerable<Purchase> purchaseSequence in query)
foreach (Purchase p in purchaseSequence)
Console.WriteLine (p.Description);
This isn’t very useful, however, because purchaseSequence has no reference to the customer. More commonly, you’d do this:
from c in customers
join p in purchases on c.ID equals p.CustomerID
into custPurchases
select new { CustName = c.Name, custPurchases };
This gives the same results as the following (inefficient) Select subquery:
from c in customers
select new
{
CustName = c.Name,
custPurchases = purchases.Where (p => c.ID == p.CustomerID)
};
By default, GroupJoin does the equivalent of a left outer join. To get an inner join—where customers without purchases are excluded—you need to filter on custPurchases:
from c in customers join p in purchases on c.ID equals p.CustomerID
into custPurchases
where custPurchases.Any()
select ...
Clauses after a group-join into operate on subsequences of inner child elements, not individual child elements. This means that to filter individual purchases, you’d have to call Where before joining:
from c in customers
join p in purchases.Where (p2 => p2.Price > 1000)
on c.ID equals p.CustomerID
into custPurchases ...
You can construct lambda queries with GroupJoin as you would with Join.
Flat outer joins
You run into a dilemma if you want both an outer join and a flat result set. GroupJoin gives you the outer join; Join gives you the flat result set. The solution is to first call GroupJoin, and then DefaultIfEmpty on each child sequence, and then finally SelectMany on the result:
from c in customers
join p in purchases on c.ID equals p.CustomerID into custPurchases
from cp in custPurchases.DefaultIfEmpty()
select new
{
CustName = c.Name,
Price = cp == null ? (decimal?) null : cp.Price
};
DefaultIfEmpty emits a null value if a subsequence of purchases is empty. The second from clause translates to SelectMany. In this role, it expands and flattens all the purchase subsequences, concatenating them into a single sequence of purchase elements.
Joining with lookups
The Join and GroupJoin methods in Enumerable work in two steps. First, they load the inner sequence into a lookup. Second, they query the outer sequence in combination with the lookup.
A lookup is a sequence of groupings that can be accessed directly by key. Another way to think of it is as a dictionary of sequences—a dictionary that can accept many elements under each key (sometimes called a multi-dictionary).
Lookups are read-only and defined by the following interface:
public interface ILookup<TKey,TElement> :
IEnumerable<IGrouping<TKey,TElement>>, IEnumerable
{
int Count { get; }
bool Contains (TKey key);
IEnumerable<TElement> this [TKey key] { get; }
}
NOTE
The joining operators—like other sequence-emitting operators—honor deferred or lazy execution semantics. This means the lookup is not built until you begin enumerating the output sequence (and then the entire lookup is built right then).
You can create and query lookups manually as an alternative strategy to using the joining operators, when dealing with local collections. There are a couple of benefits in doing so:
§ You can reuse the same lookup over multiple queries—as well as in ordinary imperative code.
§ Querying a lookup is an excellent way of understanding how Join and GroupJoin work.
The ToLookup extension method creates a lookup. The following loads all purchases into a lookup—keyed by their CustomerID:
ILookup<int?,Purchase> purchLookup =
purchases.ToLookup (p => p.CustomerID, p => p);
The first argument selects the key; the second argument selects the objects that are to be loaded as values into the lookup.
Reading a lookup is rather like reading a dictionary, except that the indexer returns a sequence of matching items, rather than a single matching item. The following enumerates all purchases made by the customer whose ID is 1:
foreach (Purchase p in purchLookup [1])
Console.WriteLine (p.Description);
With a lookup in place, you can write SelectMany/Select queries that execute as efficiently as Join/GroupJoin queries. Join is equivalent to using SelectMany on a lookup:
from c in customers
from p in purchLookup [c.ID]
select new { c.Name, p.Description, p.Price };
Tom Bike 500
Tom Holiday 2000
Dick Bike 600
Dick Phone 300
...
Adding a call to DefaultIfEmpty makes this into an outer join:
from c in customers
from p in purchLookup [c.ID].DefaultIfEmpty()
select new {
c.Name,
Descript = p == null ? null : p.Description,
Price = p == null ? (decimal?) null : p.Price
};
GroupJoin is equivalent to reading the lookup inside a projection:
from c in customers
select new {
CustName = c.Name,
CustPurchases = purchLookup [c.ID]
};
Enumerable implementations
Here’s the simplest valid implementation of Enumerable.Join, null checking aside:
public static IEnumerable <TResult> Join
<TOuter,TInner,TKey,TResult> (
this IEnumerable <TOuter> outer,
IEnumerable <TInner> inner,
Func <TOuter,TKey> outerKeySelector,
Func <TInner,TKey> innerKeySelector,
Func <TOuter,TInner,TResult> resultSelector)
{
ILookup <TKey, TInner> lookup = inner.ToLookup (innerKeySelector);
return
from outerItem in outer
from innerItem in lookup [outerKeySelector (outerItem)]
select resultSelector (outerItem, innerItem);
}
GroupJoin’s implementation is like that of Join, but simpler:
public static IEnumerable <TResult> GroupJoin
<TOuter,TInner,TKey,TResult> (
this IEnumerable <TOuter> outer,
IEnumerable <TInner> inner,
Func <TOuter,TKey> outerKeySelector,
Func <TInner,TKey> innerKeySelector,
Func <TOuter,IEnumerable<TInner>,TResult> resultSelector)
{
ILookup <TKey, TInner> lookup = inner.ToLookup (innerKeySelector);
return
from outerItem in outer
select resultSelector
(outerItem, lookup [outerKeySelector (outerItem)]);
}
The Zip Operator
IEnumerable<TFirst>, IEnumerable<TSecond>→ IEnumerable<TResult>
The Zip operator was added in Framework 4.0. It enumerates two sequences in step (like a zipper), returning a sequence based on applying a function over each element pair. For instance, the following:
int[] numbers = { 3, 5, 7 };
string[] words = { "three", "five", "seven", "ignored" };
IEnumerable<string> zip = numbers.Zip (words, (n, w) => n + "=" + w);
produces a sequence with the following elements:
3=three
5=five
7=seven
Extra elements in either input sequence are ignored. Zip is not supported when querying a database.
Ordering
IEnumerable<TSource>→ IOrderedEnumerable<TSource>
Method |
Description |
SQL equivalents |
OrderBy, ThenBy |
Sorts a sequence in ascending order |
ORDER BY ... |
OrderByDescending, ThenByDescending |
Sorts a sequence in descending order |
ORDER BY ... DESC |
Reverse |
Returns a sequence in reverse order |
Exception thrown |
Ordering operators return the same elements in a different order.
OrderBy, OrderByDescending, ThenBy, and ThenByDescending
OrderBy and OrderByDescending arguments
Argument |
Type |
Input sequence |
IEnumerable<TSource> |
Key selector |
TSource => TKey |
Return type = IOrderedEnumerable<TSource>
ThenBy and ThenByDescending arguments
Argument |
Type |
Input sequence |
IOrderedEnumerable<TSource> |
Key selector |
TSource => TKey |
Query syntax
orderby expression1 [descending] [, expression2 [descending] ... ]
Overview
OrderBy returns a sorted version of the input sequence, using the keySelector expression to make comparisons. The following query emits a sequence of names in alphabetical order:
IEnumerable<string> query = names.OrderBy (s => s);
The following sorts names by length:
IEnumerable<string> query = names.OrderBy (s => s.Length);
// Result: { "Jay", "Tom", "Mary", "Dick", "Harry" };
The relative order of elements with the same sorting key (in this case, Jay/Tom and Mary/Dick) is indeterminate—unless you append a ThenBy operator:
IEnumerable<string> query = names.OrderBy (s => s.Length).ThenBy (s => s);
// Result: { "Jay", "Tom", "Dick", "Mary", "Harry" };
ThenBy reorders only elements that had the same sorting key in the preceding sort. You can chain any number of ThenBy operators. The following sorts first by length, then by the second character, and finally by the first character:
names.OrderBy (s => s.Length).ThenBy (s => s[1]).ThenBy (s => s[0]);
The equivalent in query syntax is this:
from s in names
orderby s.Length, s[1], s[0]
select s;
WARNING
The following variation is incorrect—it will actually order first by s[1], and then by s.Length (or in the case of a database query, it will order only by s[1] and discard the former ordering):
from s in names
orderby s.Length
orderby s[1]
...
LINQ also provides OrderByDescending and ThenByDescending operators, which do the same things, emitting the results in reverse order. The following LINQ-to-db query retrieves purchases in descending order of price, with those of the same price listed alphabetically:
dataContext.Purchases.OrderByDescending (p => p.Price)
.ThenBy (p => p.Description);
In query syntax:
from p in dataContext.Purchases
orderby p.Price descending, p.Description
select p;
Comparers and collations
In a local query, the key selector objects themselves determine the ordering algorithm via their default IComparable implementation (see Chapter 7). You can override the sorting algorithm by passing in an IComparer object. The following performs a case-insensitive sort:
names.OrderBy (n => n, StringComparer.CurrentCultureIgnoreCase);
Passing in a comparer is not supported in query syntax, nor in any way by LINQ to SQL or EF. When querying a database, the comparison algorithm is determined by the participating column’s collation. If the collation is case-sensitive, you can request a case-insensitive sort by callingToUpper in the key selector:
from p in dataContext.Purchases
orderby p.Description.ToUpper()
select p;
IOrderedEnumerable and IOrderedQueryable
The ordering operators return special subtypes of IEnumerable<T>. Those in Enumerable return IOrderedEnumerable; those in Queryable return IOrderedQueryable. These subtypes allow a subsequent ThenBy operator to refine rather than replace the existing ordering.
The additional members that these subtypes define are not publicly exposed, so they present like ordinary sequences. The fact that they are different types comes into play when building queries progressively:
IOrderedEnumerable<string> query1 = names.OrderBy (s => s.Length);
IOrderedEnumerable<string> query2 = query1.ThenBy (s => s);
If we instead declare query1 of type IEnumerable<string>, the second line would not compile—ThenBy requires an input of type IOrderedEnumerable<string>. You can avoid worrying about this by implicitly typing range variables:
var query1 = names.OrderBy (s => s.Length);
var query2 = query1.ThenBy (s => s);
Implicit typing can create problems of its own, though. The following will not compile:
var query = names.OrderBy (s => s.Length);
query = query.Where (n => n.Length > 3); // Compile-time error
The compiler infers query to be of type IOrderedEnumerable<string>, based on OrderBy’s output sequence type. However, the Where on the next line returns an ordinary IEnumerable<string>, which cannot be assigned back to query. You can work around this either with explicit typing or by calling AsEnumerable() after OrderBy:
var query = names.OrderBy (s => s.Length).AsEnumerable();
query = query.Where (n => n.Length > 3); // OK
The equivalent in interpreted queries is to call AsQueryable.
Grouping
IEnumerable<TSource>→ IEnumerable<IGrouping<TSource,TElement>>
Method |
Description |
SQL equivalents |
GroupBy |
Groups a sequence into subsequences |
GROUP BY |
GroupBy
Argument |
Type |
Input sequence |
IEnumerable<TSource> |
Key selector |
TSource => TKey |
Element selector (optional) |
TSource => TElement |
Comparer (optional) |
IEqualityComparer<TKey> |
Query syntax
group element-expression by key-expression
Overview
GroupBy organizes a flat input sequence into sequences of groups. For example, the following organizes all the files in c:\temp by extension:
string[] files = Directory.GetFiles ("c:\\temp");
IEnumerable<IGrouping<string,string>> query =
files.GroupBy (file => Path.GetExtension (file));
Or if you’re comfortable with implicit typing:
var query = files.GroupBy (file => Path.GetExtension (file));
Here’s how to enumerate the result:
foreach (IGrouping<string,string> grouping in query)
{
Console.WriteLine ("Extension: " + grouping.Key);
foreach (string filename in grouping)
Console.WriteLine (" - " + filename);
}
Extension: .pdf
-- chapter03.pdf
-- chapter04.pdf
Extension: .doc
-- todo.doc
-- menu.doc
-- Copy of menu.doc
...
Enumerable.GroupBy works by reading the input elements into a temporary dictionary of lists so that all elements with the same key end up in the same sublist. It then emits a sequence of groupings. A grouping is a sequence with a Key property:
public interface IGrouping <TKey,TElement> : IEnumerable<TElement>,
IEnumerable
{
TKey Key { get; } // Key applies to the subsequence as a whole
}
By default, the elements in each grouping are untransformed input elements, unless you specify an elementSelector argument. The following projects each input element to uppercase:
files.GroupBy (file => Path.GetExtension (file), file => file.ToUpper());
An elementSelector is independent of the keySelector. In our case, this means that the Key on each grouping is still in its original case:
Extension: .pdf
-- CHAPTER03.PDF
-- CHAPTER04.PDF
Extension: .doc
-- TODO.DOC
Note that the subcollections are not emitted in alphabetical order of key. GroupBy groups only; it does not sort; in fact, it preserves the original ordering. To sort, you must add an OrderBy operator:
files.GroupBy (file => Path.GetExtension (file), file => file.ToUpper())
.OrderBy (grouping => grouping.Key);
GroupBy has a simple and direct translation in query syntax:
group element-expr by key-expr
Here’s our example in query syntax:
from file in files
group file.ToUpper() by Path.GetExtension (file);
As with select, group “ends” a query—unless you add a query continuation clause:
from file in files
group file.ToUpper() by Path.GetExtension (file) into grouping
orderby grouping.Key
select grouping;
Query continuations are often useful in a group by query. The next query filters out groups that have fewer than five files in them:
from file in files
group file.ToUpper() by Path.GetExtension (file) into grouping
where grouping.Count() >= 5
select grouping;
NOTE
A where after a group by is equivalent to HAVING in SQL. It applies to each subsequence or grouping as a whole, rather than the individual elements.
Sometimes you’re interested purely in the result of an aggregation on a grouping and so can abandon the subsequences:
string[] votes = { "Bush", "Gore", "Gore", "Bush", "Bush" };
IEnumerable<string> query = from vote in votes
group vote by vote into g
orderby g.Count() descending
select g.Key;
string winner = query.First(); // Bush
GroupBy in LINQ to SQL and EF
Grouping works in the same way when querying a database. If you have association properties set up, you’ll find, however, that the need to group arises less frequently than with standard SQL. For instance, to select customers with at least two purchases, you don’t need to group; the following query does the job nicely:
from c in dataContext.Customers
where c.Purchases.Count >= 2
select c.Name + " has made " + c.Purchases.Count + " purchases";
An example of when you might use grouping is to list total sales by year:
from p in dataContext.Purchases
group p.Price by p.Date.Year into salesByYear
select new {
Year = salesByYear.Key,
TotalValue = salesByYear.Sum()
};
LINQ’s grouping operators expose a superset of SQL’s “GROUP BY” functionality.
Another departure from traditional SQL comes in there being no obligation to project the variables or expressions used in grouping or sorting.
Grouping by multiple keys
You can group by a composite key, using an anonymous type:
from n in names
group n by new { FirstLetter = n[0], Length = n.Length };
Custom equality comparers
You can pass a custom equality comparer into GroupBy, in a local query, to change the algorithm for key comparison. Rarely is this required, though, because changing the key selector expression is usually sufficient.
For instance, the following creates a case-insensitive grouping:
group name by name.ToUpper()
Set Operators
IEnumerable<TSource>, IEnumerable<TSource>→IEnumerable<TSource>
Method |
Description |
SQL equivalents |
Concat |
Returns a concatenation of elements in each of the two sequences |
UNION ALL |
Union |
Returns a concatenation of elements in each of the two sequences, excluding duplicates |
UNION |
Intersect |
Returns elements present in both sequences |
WHERE ... IN (...) |
Except |
Returns elements present in the first, but not the second sequence |
EXCEPT or WHERE ... NOT IN (...) |
Concat and Union
Concat returns all the elements of the first sequence, followed by all the elements of the second. Union does the same, but removes any duplicates:
int[] seq1 = { 1, 2, 3 }, seq2 = { 3, 4, 5 };
IEnumerable<int>
concat = seq1.Concat (seq2), // { 1, 2, 3, 3, 4, 5 }
union = seq1.Union (seq2); // { 1, 2, 3, 4, 5 }
Specifying the type argument explicitly is useful when the sequences are differently typed, but the elements have a common base type. For instance, with the reflection API (Chapter 19), methods and properties are represented with MethodInfo and PropertyInfo classes, which have a common base class called MemberInfo. We can concatenate methods and properties by stating that base class explicitly when calling Concat:
MethodInfo[] methods = typeof (string).GetMethods();
PropertyInfo[] props = typeof (string).GetProperties();
IEnumerable<MemberInfo> both = methods.Concat<MemberInfo> (props);
In the next example, we filter the methods before concatenating:
var methods = typeof (string).GetMethods().Where (m => !m.IsSpecialName);
var props = typeof (string).GetProperties();
var both = methods.Concat<MemberInfo> (props);
This example relies on interface type parameter variance: methods is of type IEnumerable<MethodInfo>, which requires a covariant conversion to IEnumerable<MemberInfo>. It’s a good illustration of how variance makes things work more as you’d expect.
Intersect and Except
Intersect returns the elements that two sequences have in common. Except returns the elements in the first input sequence that are not present in the second:
int[] seq1 = { 1, 2, 3 }, seq2 = { 3, 4, 5 };
IEnumerable<int>
commonality = seq1.Intersect (seq2), // { 3 }
difference1 = seq1.Except (seq2), // { 1, 2 }
difference2 = seq2.Except (seq1); // { 4, 5 }
Enumerable.Except works internally by loading all of the elements in the first collection into a dictionary, then removing from the dictionary all elements present in the second sequence. The equivalent in SQL is a NOT EXISTS or NOT IN subquery:
SELECT number FROM numbers1Table
WHERE number NOT IN (SELECT number FROM numbers2Table)
Conversion Methods
LINQ deals primarily in sequences—in other words, collections of type IEnumerable<T>. The conversion methods convert to and from other types of collections:
Method |
Description |
OfType |
Converts IEnumerable to IEnumerable<T>, discarding wrongly typed elements |
Cast |
Converts IEnumerable to IEnumerable<T>, throwing an exception if there are any wrongly typed elements |
ToArray |
Converts IEnumerable<T> to T[] |
ToList |
Converts IEnumerable<T> to List<T> |
ToDictionary |
Converts IEnumerable<T> to Dictionary<TKey,TValue> |
ToLookup |
Converts IEnumerable<T> to ILookup<TKey,TElement> |
AsEnumerable |
Downcasts to IEnumerable<T> |
AsQueryable |
Casts or converts to IQueryable<T> |
OfType and Cast
OfType and Cast accept a nongeneric IEnumerable collection and emit a generic IEnumerable<T> sequence that you can subsequently query:
ArrayList classicList = new ArrayList(); // in System.Collections
classicList.AddRange ( new int[] { 3, 4, 5 } );
IEnumerable<int> sequence1 = classicList.Cast<int>();
Cast and OfType differ in their behavior when encountering an input element that’s of an incompatible type. Cast throws an exception; OfType ignores the incompatible element. Continuing the preceding example:
DateTime offender = DateTime.Now;
classicList.Add (offender);
IEnumerable<int>
sequence2 = classicList.OfType<int>(), // OK - ignores offending DateTime
sequence3 = classicList.Cast<int>(); // Throws exception
The rules for element compatibility exactly follow those of C#’s is operator, and therefore consider only reference conversions and unboxing conversions. We can see this by examining the internal implementation of OfType:
public static IEnumerable<TSource> OfType <TSource> (IEnumerable source)
{
foreach (object element in source)
if (element is TSource)
yield return (TSource)element;
}
Cast has an identical implementation, except that it omits the type compatibility test:
public static IEnumerable<TSource> Cast <TSource> (IEnumerable source)
{
foreach (object element in source)
yield return (TSource)element;
}
A consequence of these implementations is that you cannot use Cast to perform numeric or custom conversions (for these, you must perform a Select operation instead). In other words, Cast is not as flexible as C#’s cast operator:
int i = 3;
long l = i; // Implicit numeric conversion int->long
int i2 = (int) l; // Explicit numeric conversion long->int
We can demonstrate this by attempting to use OfType or Cast to convert a sequence of ints to a sequence of longs:
int[] integers = { 1, 2, 3 };
IEnumerable<long> test1 = integers.OfType<long>();
IEnumerable<long> test2 = integers.Cast<long>();
When enumerated, test1 emits zero elements and test2 throws an exception. Examining OfType’s implementation, it’s fairly clear why. After substituting TSource, we get the following expression:
(element is long)
which returns false for an int element, due to the lack of an inheritance relationship.
NOTE
The reason for test2 throwing an exception, when enumerated, is subtler. Notice in Cast’s implementation that element is of type object. When TSource is a value type, the CLR assumes this is an unboxing conversion, and synthesizes a method that reproduces the scenario described in the section Boxing and Unboxing in Chapter 3:
int value = 123;
object element = value;
long result = (long) element; // exception
Because the element variable is declared of type object, an object-to-long cast is performed (an unboxing) rather than an int-to-long numeric conversion. Unboxing operations require an exact type match, so the object-to-long unbox fails when given an int.
As we suggested previously, the solution is to use an ordinary Select:
IEnumerable<long> castLong = integers.Select (s => (long) s);
OfType and Cast are also useful in downcasting elements in a generic input sequence. For instance, if you have an input sequence of type IEnumerable<Fruit>, OfType<Apple> would return just the apples. This is particularly useful in LINQ to XML (see Chapter 10).
Cast has query syntax support: simply precede the range variable with a type:
from TreeNode node in myTreeView.Nodes
...
ToArray, ToList, ToDictionary, and ToLookup
ToArray and ToList emit the results into an array or generic list. These operators force the immediate enumeration of the input sequence (unless indirected via a subquery or expression tree). For examples, refer to the section Deferred Execution in Chapter 8.
ToDictionary and ToLookup accept the following arguments:
Argument |
Type |
Input sequence |
IEnumerable<TSource> |
Key selector |
TSource => TKey |
Element selector (optional) |
TSource => TElement |
Comparer (optional) |
IEqualityComparer<TKey> |
ToDictionary also forces immediate execution of a sequence, writing the results to a generic Dictionary. The keySelector expression you provide must evaluate to a unique value for each element in the input sequence; otherwise, an exception is thrown. In contrast, ToLookup allows many elements of the same key. We describe lookups in the earlier section Joining with lookups.
AsEnumerable and AsQueryable
AsEnumerable upcasts a sequence to IEnumerable<T>, forcing the compiler to bind subsequent query operators to methods in Enumerable, instead of Queryable. For an example, see the section Combining Interpreted and Local Queries in Chapter 8.
AsQueryable downcasts a sequence to IQueryable<T> if it implements that interface. Otherwise, it instantiates an IQueryable<T> wrapper over the local query.
Element Operators
IEnumerable<TSource>→ TSource
Method |
Description |
SQL equivalents |
First, FirstOrDefault |
Returns the first element in the sequence, optionally satisfying a predicate |
SELECT TOP 1 ... ORDER BY ... |
Last, LastOrDefault |
Returns the last element in the sequence, optionally satisfying a predicate |
SELECT TOP 1 ... ORDER BY ... DESC |
Single, SingleOrDefault |
Equivalent to First/FirstOrDefault, but throws an exception if there is more than one match |
|
ElementAt, ElementAtOrDefault |
Returns the element at the specified position |
Exception thrown |
DefaultIfEmpty |
Returns a single-element sequence whose value is default(TSource) if the sequence has no elements |
OUTER JOIN |
Methods ending in “OrDefault” return default(TSource) rather than throwing an exception if the input sequence is empty or if no elements match the supplied predicate.
default(TSource) is null for reference type elements, false for the bool type and zero for numeric types.
First, Last, and Single
Argument |
Type |
Source sequence |
IEnumerable<TSource> |
Predicate (optional) |
TSource => bool |
The following example demonstrates First and Last:
int[] numbers = { 1, 2, 3, 4, 5 };
int first = numbers.First(); // 1
int last = numbers.Last(); // 5
int firstEven = numbers.First (n => n % 2 == 0); // 2
int lastEven = numbers.Last (n => n % 2 == 0); // 4
The following demonstrates First versus FirstOrDefault:
int firstBigError = numbers.First (n => n > 10); // Exception
int firstBigNumber = numbers.FirstOrDefault (n => n > 10); // 0
To avoid an exception, Single requires exactly one matching element; SingleOrDefault requires one or zero matching elements:
int onlyDivBy3 = numbers.Single (n => n % 3 == 0); // 3
int divBy2Err = numbers.Single (n => n % 2 == 0); // Error: 2 & 4 match
int singleError = numbers.Single (n => n > 10); // Error
int noMatches = numbers.SingleOrDefault (n => n > 10); // 0
int divBy2Error = numbers.SingleOrDefault (n => n % 2 == 0); // Error
Single is the “fussiest” in this family of element operators. FirstOrDefault and LastOrDefault are the most tolerant.
In LINQ to SQL and EF, Single is often used to retrieve a row from a table by primary key:
Customer cust = dataContext.Customers.Single (c => c.ID == 3);
ElementAt
Argument |
Type |
Source sequence |
IEnumerable<TSource> |
Index of element to return |
int |
ElementAt picks the nth element from the sequence:
int[] numbers = { 1, 2, 3, 4, 5 };
int third = numbers.ElementAt (2); // 3
int tenthError = numbers.ElementAt (9); // Exception
int tenth = numbers.ElementAtOrDefault (9); // 0
Enumerable.ElementAt is written such that if the input sequence happens to implement IList<T>, it calls IList<T>’s indexer. Otherwise, it enumerates n times, and then returns the next element. ElementAt is not supported in LINQ to SQL or EF.
DefaultIfEmpty
DefaultIfEmpty returns a sequence containing a single element whose value is default(TSource) if the input sequence has no elements. Otherwise it returns the input sequence unchanged. This is used in writing flat outer joins: see the earlier sections Outer joins with SelectMany andFlat outer joins.
Aggregation Methods
IEnumerable<TSource>→ scalar
Method |
Description |
SQL equivalents |
Count, LongCount |
Returns the number of elements in the input sequence, optionally satisfying a predicate |
COUNT (...) |
Min, Max |
Returns the smallest or largest element in the sequence |
MIN (...), MAX (...) |
Sum, Average |
Calculates a numeric sum or average over elements in the sequence |
SUM (...), AVG (...) |
Aggregate |
Performs a custom aggregation |
Exception thrown |
Count and LongCount
Argument |
Type |
Source sequence |
IEnumerable<TSource> |
Predicate (optional) |
TSource => bool |
Count simply enumerates over a sequence, returning the number of items:
int fullCount = new int[] { 5, 6, 7 }.Count(); // 3
The internal implementation of Enumerable.Count tests the input sequence to see whether it happens to implement ICollection<T>. If it does, it simply calls ICollection<T>.Count. Otherwise, it enumerates over every item, incrementing a counter.
You can optionally supply a predicate:
int digitCount = "pa55w0rd".Count (c => char.IsDigit (c)); // 3
LongCount does the same job as Count, but returns a 64-bit integer, allowing for sequences of greater than 2 billion elements.
Min and Max
Argument |
Type |
Source sequence |
IEnumerable<TSource> |
Result selector (optional) |
TSource => TResult |
Min and Max return the smallest or largest element from a sequence:
int[] numbers = { 28, 32, 14 };
int smallest = numbers.Min(); // 14;
int largest = numbers.Max(); // 32;
If you include a selector expression, each element is first projected:
int smallest = numbers.Max (n => n % 10); // 8;
A selector expression is mandatory if the items themselves are not intrinsically comparable—in other words, if they do not implement IComparable<T>:
Purchase runtimeError = dataContext.Purchases.Min (); // Error
decimal? lowestPrice = dataContext.Purchases.Min (p => p.Price); // OK
A selector expression determines not only how elements are compared, but also the final result. In the preceding example, the final result is a decimal value, not a purchase object. To get the cheapest purchase, you need a subquery:
Purchase cheapest = dataContext.Purchases
.Where (p => p.Price == dataContext.Purchases.Min (p2 => p2.Price))
.FirstOrDefault();
In this case, you could also formulate the query without an aggregation—using an OrderBy followed by FirstOrDefault.
Sum and Average
Argument |
Type |
Source sequence |
IEnumerable<TSource> |
Result selector (optional) |
TSource => TResult |
Sum and Average are aggregation operators that are used in a similar manner to Min and Max:
decimal[] numbers = { 3, 4, 8 };
decimal sumTotal = numbers.Sum(); // 15
decimal average = numbers.Average(); // 5 (mean value)
The following returns the total length of each of the strings in the names array:
int combinedLength = names.Sum (s => s.Length); // 19
Sum and Average are fairly restrictive in their typing. Their definitions are hard-wired to each of the numeric types (int, long, float, double, decimal, and their nullable versions). In contrast, Min and Max can operate directly on anything that implements IComparable<T>—such as a string, for instance.
Further, Average always returns either decimal or double, according to the following table:
Selector type |
Result type |
decimal |
decimal |
int, long, float, double |
double |
This means the following does not compile (“cannot convert double to int”):
int avg = new int[] { 3, 4 }.Average();
But this will compile:
double avg = new int[] { 3, 4 }.Average(); // 3.5
Average implicitly upscales the input values to avoid loss of precision. In this example, we averaged integers and got 3.5, without needing to resort to an input element cast:
double avg = numbers.Average (n => (double) n);
When querying a database, Sum and Average translate to the standard SQL aggregations. The following query returns customers whose average purchase was more than $500:
from c in dataContext.Customers
where c.Purchases.Average (p => p.Price) > 500
select c.Name;
Aggregate
Aggregate allows you to specify a custom accumulation algorithm for implementing unusual aggregations. Aggregate is not supported in LINQ to SQL or Entity Framework, and is somewhat specialized in its use cases. The following demonstrates how Aggregate can do the work ofSum:
int[] numbers = { 2, 3, 4 };
int sum = numbers.Aggregate (0, (total, n) => total + n); // 9
The first argument to Aggregate is the seed, from which accumulation starts. The second argument is an expression to update the accumulated value, given a fresh element. You can optionally supply a third argument to project the final result value from the accumulated value.
NOTE
Most problems for which Aggregate has been designed can be solved as easily with a foreach loop—and with more familiar syntax. The advantage of using Aggregate is that with large or complex aggregations, you can automatically parallelize the operation with PLINQ (see Chapter 23).
Unseeded aggregations
You can omit the seed value when calling Aggregate, in which case the first element becomes the implicit seed, and aggregation proceeds from the second element. Here’s the preceding example, unseeded:
int[] numbers = { 1, 2, 3 };
int sum = numbers.Aggregate ((total, n) => total + n); // 6
This gives the same result as before, but we’re actually doing a different calculation. Before, we were calculating 0+1+2+3; now we’re calculating 1+2+3. We can better illustrate the difference by multiplying instead of adding:
int[] numbers = { 1, 2, 3 };
int x = numbers.Aggregate (0, (prod, n) => prod * n); // 0*1*2*3 = 0
int y = numbers.Aggregate ( (prod, n) => prod * n); // 1*2*3 = 6
As we’ll see in Chapter 23, unseeded aggregations have the advantage of being parallelizable without requiring the use of special overloads. However, there are some traps with unseeded aggregations.
Traps with unseeded aggregations
The unseeded aggregation methods are intended for use with delegates that are commutative and associative. If used otherwise, the result is either unintuitive (with ordinary queries) or nondeterministic (in the case that you parallelize the query with PLINQ). For example, consider the following function:
(total, n) => total + n * n
This is neither commutative nor associative. (For example, 1+2*2 != 2+1*1). Let’s see what happens when we use it to sum the square of the numbers 2, 3, and 4:
int[] numbers = { 2, 3, 4 };
int sum = numbers.Aggregate ((total, n) => total + n * n); // 27
Instead of calculating:
2*2 + 3*3 + 4*4 // 29
it calculates:
2 + 3*3 + 4*4 // 27
We can fix this in a number of ways. First, we could include 0 as the first element:
int[] numbers = { 0, 2, 3, 4 };
Not only is this inelegant, but it will still give incorrect results if parallelized—because PLINQ leverages the function’s assumed associativity by selecting multiple elements as seeds. To illustrate, if we denote our aggregation function as follows:
f(total, n) => total + n * n
then LINQ to Objects would calculate this:
f(f(f(0, 2),3),4)
whereas PLINQ may do this:
f(f(0,2),f(3,4))
with the following result:
First partition: a = 0 + 2*2 (= 4)
Second partition: b = 3 + 4*4 (= 19)
Final result: a + b*b (= 365)
OR EVEN: b + a*a (= 35)
There are two good solutions. The first is to turn this into a seeded aggregation—with zero as the seed. The only complication is that with PLINQ, we’d need to use a special overload in order for the query not to execute sequentially (see Optimizing PLINQ in Chapter 23).
The second solution is to restructure the query such that the aggregation function is commutative and associative:
int sum = numbers.Select (n => n * n).Aggregate ((total, n) => total + n);
NOTE
Of course, in such simple scenarios you can (and should) use the Sum operator instead of Aggregate:
int sum = numbers.Sum (n => n * n);
You can actually go quite far just with Sum and Average. For instance, you can use Average to calculate a root-mean-square:
Math.Sqrt (numbers.Average (n => n * n))
and even standard deviation:
double mean = numbers.Average();
double sdev = Math.Sqrt (numbers.Average (n =>
{
double dif = n - mean;
return dif * dif;
}));
Both are safe, efficient and fully parallelizable. In Chapter 23, we’ll give a practical example of a custom aggregation that can’t be reduced to Sum or Average.
Quantifiers
IEnumerable<TSource>→ bool
Method |
Description |
SQL equivalents |
Contains |
Returns true if the input sequence contains the given element |
WHERE ... IN (...) |
Any |
Returns true if any elements satisfy the given predicate |
WHERE ... IN (...) |
All |
Returns true if all elements satisfy the given predicate |
WHERE (...) |
SequenceEqual |
Returns true if the second sequence has identical elements to the input sequence |
Contains and Any
The Contains method accepts an argument of type TSource; Any accepts an optional predicate.
Contains returns true if the given element is present:
bool hasAThree = new int[] { 2, 3, 4 }.Contains (3); // true;
Any returns true if the given expression is true for at least one element. We can rewrite the preceding query with Any as follows:
bool hasAThree = new int[] { 2, 3, 4 }.Any (n => n == 3); // true;
Any can do everything that Contains can do, and more:
bool hasABigNumber = new int[] { 2, 3, 4 }.Any (n => n > 10); // false;
Calling Any without a predicate returns true if the sequence has one or more elements. Here’s another way to write the preceding query:
bool hasABigNumber = new int[] { 2, 3, 4 }.Where (n => n > 10).Any();
Any is particularly useful in subqueries and is used often when querying databases, for example:
from c in dataContext.Customers
where c.Purchases.Any (p => p.Price > 1000)
select c
All and SequenceEqual
All returns true if all elements satisfy a predicate. The following returns customers whose purchases are less than $100:
dataContext.Customers.Where (c => c.Purchases.All (p => p.Price < 100));
SequenceEqual compares two sequences. To return true, each sequence must have identical elements, in the identical order.
Generation Methods
void→IEnumerable<TResult>
Method |
Description |
Empty |
Creates an empty sequence |
Repeat |
Creates a sequence of repeating elements |
Range |
Creates a sequence of integers |
Empty, Repeat, and Range are static (nonextension) methods that manufacture simple local sequences.
Empty
Empty manufactures an empty sequence and requires just a type argument:
foreach (string s in Enumerable.Empty<string>())
Console.Write (s); // <nothing>
In conjunction with the ?? operator, Empty does the reverse of DefaultIfEmpty. For example, suppose we have a jagged array of integers, and we want to get all the integers into a single flat list. The following SelectMany query fails if any of the inner arrays is null:
int[][] numbers =
{
new int[] { 1, 2, 3 },
new int[] { 4, 5, 6 },
null // this null makes the query below fail.
};
IEnumerable<int> flat = numbers.SelectMany (innerArray => innerArray);
Empty in conjunction with ?? fixes the problem:
IEnumerable<int> flat = numbers
.SelectMany (innerArray => innerArray ?? Enumerable.Empty <int>());
foreach (int i in flat)
Console.Write (i + " "); // 1 2 3 4 5 6
Range and Repeat
Range and Repeat work only with integers. Range accepts a starting index and count:
foreach (int i in Enumerable.Range (5, 3))
Console.Write (i + " "); // 5 6 7
Repeat accepts the number to repeat, and the number of iterations:
foreach (int i in Enumerable.Repeat (5, 3))
Console.Write (i + " "); // 5 5 5