Creating Domain-Specific Languages - Thinking in LINQ: Harnessing the power of functional programing in .NET applications (2014)

Thinking in LINQ: Harnessing the power of functional programing in .NET applications (2014)

Chapter 6. Creating Domain-Specific Languages

Every domain has a language that helps practitioners communicate their thoughts easily. When a chef instructs a helper to sauté vegetables, both parties know what the term means. However, software solutions created for several domains using general-purpose, high-level programming languages such as C# or Java often quickly become difficult to maintain. Even only a few months later, the original authors of these solutions may find it difficult to remember how the code works so they can fix bugs.

Thankfully, you can avoid such problems by creating expressive and succinct internal APIs. These API sets are also called internal, or embedded, domain-specific languages (DSLs). You can later extend these to develop an external DSL that subject matter experts who do not necessarily have programming experience in a high-level, general-purpose programming language can use. Subject matter experts usually appreciate the opportunity to express themselves with these DSLs. MATLAB is an example of an external DSL, which is unsuitable for general-purpose tasks. In contrast, LINQ is an example of an internal, or embedded, DSL.

In this chapter, the first few sections describe techniques for creating embedded DSLs that blend well with the syntax of the host language (C#, in this case). The later sections show how to use the open source LINQ API to create external DSLs that use the internal DSL created previously. The purpose of this chapter is to illustrate how to apply these techniques to a variety of problem domains for your own purposes.

The next section compares the classic style of coding to using a DSL.

6-1. Feel the Difference

Compare the side-by-side code blocks in Figure 6-1 and try to understand what they do.

image

Figure 6-1. DSLs can improve the readability of a program

The loop at the left finds Armstrong numbers between 1 and 1,000. The LINQ query on the right does the same thing. Which one do you think more closely approaches the textual definition of Armstrong numbers? In my opinion, the LINQ query is much cleaner and far more readable than the loop. Some of my colleagues have argued that if they were to start writing code like the LINQ query on the right, they would soon forget how to interpret the loops. You’ll have to decide which approach is better. But if you have the war wounds most veteran programmers eventually experience, you will probably agree that code readability counts.

You may have noticed that the Digits() and Cube() methods are not shown in Figure 6-1. DSLs are not magical. Using them will not let you skip the process of implementing the logic. But in this case, the implementations are elsewhere, wrapped in some classes. The next section shows how to create a mini DSL to find numbers such as Armstrong numbers.

6-2. Creating a Simple DSL for Mathematicians

A language is made up of vocabulary. A domain-specific language is no different. To create a DSL, you first need to identify the reasonable vocabulary set for that language. The language you’ll create in this section is intended to help mathematicians write expressive code to find numbers such as Armstrong numbers.

Problem

Create a set of vocabulary and the grammar that glues those vocabulary terms together.

Solution

You may recall from Chapter 2 that Armstrong numbers and other related numbers are denoted by functions that act on the digits of those numbers. So Digits must be a word in this language. You also want to declare all the functions that can be performed on digits as vocabulary terms, so users (mathematicians) can glue those functions together. So functions such as Sum, Cube, and Factorial have to be declared as vocabulary terms. Since this is going to be a DSL for finding interesting numbers such as Armstrong numbers, I have named it Armstrong.

In C#, you can define these vocabulary terms as extension methods. Consider how Microsoft implemented LINQ standard query operators as extension methods of IEnumerable<T>. Can you see a connection?

Add the class code shown in Listing 6-1 to a new LINQPad query tab. Select C# Program from the Language drop-down menu.

Listing 6-1. Caption Here

public static class IntEx
{
public static int Cube(this int number)
{
return number * number * number;
}
public static int Square(this int number)
{
return number * number;
}

public static IEnumerable<int> Digits(this int number)
{
return number.ToString().ToCharArray()
.Select (n => Convert.ToInt32(n.ToString()));
}
public static IEnumerable<int> ReverseDigits(this int number)
{
return number.Digits().Reverse();
}
public static IEnumerable<int> EvenDigits(this int number)
{
return number.ToString().ToCharArray()
.Where ((m,i) => i%2==0).Select (n => Convert.ToInt32(n.ToString()));
}
public static IEnumerable<int> OddDigits(this int number)
{
return number.ToString().ToCharArray()
.Where ((m,i) => i%2!=0).Select (n => Convert.ToInt32(n.ToString()));
}
public static bool Are(this IEnumerable<int> actualDigits, params int[] digits)
{
return actualDigits.SequenceEqual(digits);
}
public static IEnumerable<int> DigitsAt(this int number, params int[] indices)
{
var asString = number.ToString();
return indices.Select (i => Convert.ToInt32(asString[i].ToString()));
}
public static bool AreZero(this IEnumerable<int> digits)
{
return digits.All (d => d == 0);
}
public static int FormNumber(this IEnumerable<int> digits)
{
return digits.Select ((d,i) => d * (int)Math.Pow (10,digits.Count()-(i+1)))
.Aggregate ((a,b) => a + b);
}
public static IEnumerable<int> Factorial(this IEnumerable<int> digits)
{
foreach (var d in digits)
if (d == 0)
yield return 1;
else
yield return Enumerable.Range(1, d).Aggregate((a, b) => a * b);
}

public static int Product(this IEnumerable<int> digits)
{
return digits.Aggregate ((f,s) => f*s);
}

public static IEnumerable<int> Cube(this IEnumerable<int> digits)
{
return digits.Select (d => d * d * d);
}

public static IEnumerable<int> Square(this IEnumerable<int> digits)
{
return digits.Select (d => d * d);
}

public static IEnumerable<int> RaiseSelfToSelf(this IEnumerable<int> digits)
{
return digits.Select (d => (int) Math.Pow(d,d));
}

public static IEnumerable<int> IncrementalPower(this IEnumerable<int> digits)
{
return digits.Select ((d,i) => (int) Math.Pow(d,i));
}
}

Add the following code inside Main():

void Main()
{
Enumerable.Range(1,10000)
.Where ( n => n.Digits().Cube().Sum() == n )
.Dump("Armstrong Numbers");
}

Running this code results in the output shown in Figure 6-2.

image

Figure 6-2. Armstrong numbers found with the help of the Armstrong DSL

Some of the extension methods in the IntEx class might look trivial. However, they are important for creating an expressive language. Moreover, these methods are critical when you later want to expose this embedded DSL as an external DSL (outside the host language C#).

Now, let’s see how the new language helps define similar numbers. Figure 6-3 shows the definitions of some similar numbers from Wikipedia (en.wikipedia.org/wiki/Narcissistic_number).

image

Figure 6-3. Definitions for several numbers similar to Armstrong numbers

Using the newly designed Armstrong embedded DSL, you can find these numbers by writing expressive code, as shown in Listing 6-2. The code segments that use the methods in the DSL appear in bold. Now compare the definitions of these numbers (in Figure 6-3) to the code that finds them in Listing 6-2. This way, this embedded DSL will help users keep their code clean and professional. To run this code in LINQPad, set the Language drop-down menu to C# Statements.

Listing 6-2. Caption Here

Enumerable.Range(0,10000)
.Where (d => d.Digits().Sum().Cube() == d)
.Dump("Dudeney numbers");
Enumerable.Range(0,10000)
.Where (d => d.Digits().Factorial().Sum() == d)
.Dump("Factorions");
Enumerable.Range(0,10000)
.Where (d => d.Digits().Sum() * d.Digits().Product() == d)
.Dump("Sum Product numbers");

Figure 6-4 shows the result of running the code in Listing 6-2.

image

Figure 6-4. Result of the code written in embedded DSL

How It Works

Congratulations! You have just created a small embedded DSL. This language has a vocabulary described by using C# extension methods on the integer data type. Table 6-1 lists these methods.

Table 6-1. Methods in Armstrong

Function

Purpose

Digits

This returns an IEnumerable<int> with the digits of the given integer.

ReverseDigits

This returns an IEnumerable<int> with the digits of the given integer in reverse order.

EvenDigits

This returns an IEnumerable<int> with just the digits at the even indices of the given integer.

OddDigits

This returns an IEnumerable<int> with just the digits at the odd indices of the given integer.

Cube

There are two versions of this method. The first one operates on a list of digits, and the second one operates on an integer.

Square

Similar to Cube. There are two versions of this method. The first one operates on a list of digits, and the second operates on an integer.

Product

This method returns the product of the digits of the number.

Are

This is a handy synonym for the SequenceEqual method, just to make its purpose a little bit more understandable.

Factorial

This returns the factorials of the digits of the given number.

AreZero

This is a predicate. This returns true if all the given digits are zero.

FormNumber

This helper function helps create a number formed from the digits of the given number.

RaiseSelfToSelf

This function returns all the digits raised to the power of themselves.

IncrementalPower

This function returns all the digits raised to the power of their index in the list.

6-3. Testing Armstrong by Using NUnit

Most of the methods shown in the preceding section are simple and easy to understand. However, I thought it would be nice to have a small test suite for these methods to make their usage immediately evident.

Problem

Create a test suite for the methods in Recipe 6-1.

Solution

To create a test suite, I used the NUnit framework. So the first step is to go to the NUnit web site and get the latest appropriate version for you. I used the version available at http://launchpad.net/nunitv2/trunk/2.6.3/+download/NUnit-2.6.3.msi.

After you have installed the NUnit framework, follow these steps:

1.    Create a Class Library project and copy all the code for Armstrong and name it Armstrong.

2.    Add another class library project called ArmstrongTest. Add NUnit references there along with the Armstrong reference, as shown in Figure 6-5.

image

Figure 6-5. Adding references to NUnit and the Armstrong project to the test project ArmstrongTest

3.    Change the name of the class from Class1.cs to ArmstrongTest.cs and add the code in Listing 6-3 to that class file.

Listing 6-3. Caption Here

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using NUnit.Framework;
using Armstrong;

namespace ArmstrongTest
{
[TestFixture]
public class ArmstrongTest
{
[Test]
public void TestFormNumber()
{
Assert.AreEqual(24, 12345.OddDigits().FormNumber());
Assert.AreEqual(135, 12345.EvenDigits().FormNumber());
}
[Test]
public void TestDigitsAt()
{
Assert.IsTrue(12345.DigitsAt(1, 3).SequenceEqual(new int[] { 2, 4 }));
}
[Test]
public void TestEvenDigits()
{
Assert.IsTrue(12345.EvenDigits().SequenceEqual(new int[] { 1, 3, 5 }));
}
[Test]
public void TestOddDigits()
{
Assert.IsTrue(12345.OddDigits().SequenceEqual(new int[] { 2, 4 }));
}
[Test]
public void TestFactorial()
{
Assert.AreEqual(145, 145.Digits().Factorial().Sum());
}
[Test]
public void TestRaiseToSelf()
{
Assert.AreEqual(32, 123.Digits().RaiseSelfToSelf().Sum());
}
[Test]
public void TestIncrementalPower()
{
Assert.AreEqual(12, 123.Digits().IncrementalPower().Sum());
}
[Test]
public void TestProducts()
{
Assert.AreEqual(6, 123.Digits().Product());
}
[Test]
public void TestDigits()
{
Assert.IsTrue(1234.Digits().SequenceEqual(new int[]{1, 2, 3, 4}));
}
[Test]
public void TestArmstrongNumber()
{
Assert.IsTrue(153.Digits().Cube().Sum() == 153);
}
[Test]
public void TestDudeney()
{
Assert.IsTrue(512.Digits().Sum().Cube() == 512);
}
}
}

Image Note  Make sure to add a reference to Armstrong.dll to the reference of this test project. Otherwise, it won’t compile.

4.    After building the test project successfully, open the NUnit GUI to load the tests and run them. To do that, that you must locate ArmstrongTest.dll, as shown in Figure 6-6.

image

Figure 6-6. Locate the ArmstrongTest.dll from wherever you have saved the project

5.    When you locate the ArmstrongTest.dll, all the tests will load, as shown in Figure 6-7.

image

Figure 6-7. All the Armstrong tests loaded in the NUnit GUI

6.    Click the Run button. Every test should pass, and you should see a green bar.

Feel free to do your own experiments and see how the results differ. In fact, I recommend that you make changes to the tests or to the Armstrong code so it fails some of the tests.

How It Works

The code in this section works by building a series of assertion tests that check the functions in Armstrong.dll, thus verifying that the code works as expected.

6-4. Exposing Armstrong as an External DSL

As designed so far, the Armstrong language is good for boosting developer productivity, but it is still unusable by people who don’t code in C# or some other .NET language. For the language to serve its purpose, you need to expose it in a form that accepts free-form input from users and generates the appropriate queries from that input. In this section, you will see how to expose the functionality of the language by using an external English-like language with Armstrong keywords embedded. This will not be hard for users, because the Armstrong keywords match the normal English vocabulary of mathematicians, so they should feel at home using it.

Again, comparing the internal DSL with the typical mathematical vocabulary should help illustrate the difference. Figure 6-8 shows both.

image

Figure 6-8. The difference between the external and internal representation of Armstrong

While a C# programmer could write the first version shown in Figure 6-8 by using the internal representation of Armstrong with host language support from C#, a nonprogrammer domain expert (mathematicians, in this case) would likely find the free-flowing form of Armstrong expressed in the lower part of Figure 6-8 far simpler and more intuitive. That’s why DSLs are important. I have highlighted the Armstrong tokens in the text phrase in bold.

Problem

Expose Armstrong as a free-flowing, English-like, external DSL.

Solution

The strategy is to generate Armstrong code from an English-like syntax. The generated Armstrong query must be able to run on a range of input variables. To run the generated LINQ statement, you need a LINQ compiler that can take a LINQ statement as a string and run it against a given input range. For this, I have used the open source LINQ compiler available from CodePlex at http://linqcompiler.codeplex.com/.

Here are the steps you need to follow:

1.    Download the LINQ compiler and place the compiler DLL (Evaluant.LINQ.Compiler.dll) in a new folder.

2.    In LINQPad, add the IntEx class created earlier in this chapter to the query. Change the query type to C# Program. Add the following code in Main() and outside of Main() as part of the executing class:

private static string SanitizeBraces(string generatedStatement)
{
int gap = generatedStatement.ToCharArray().Count(c => c == '(') -
generatedStatement.ToCharArray().Count(c => c == ')');
if (gap == 0)
return generatedStatement;
else
return generatedStatement + new string(')', gap);
}
private static string GenerateArmStrongStatement(List<string> tokens)
{
Dictionary<string, string> mapping = new Dictionary<string, string>();
mapping.Add("*", "*");
mapping.Add("times", "*");
mapping.Add("(", ")");
mapping.Add(")", "(");
mapping.Add("are-same", ".IsSame()");
mapping.Add("==", "==");
mapping.Add("proper-divisors", ".ProperDivisors()");
mapping.Add("even-digits", ".EvenDigits()");
mapping.Add("odd-digits", ".OddDigits()");
mapping.Add("number", "n");
mapping.Add("square", ".Square()");
mapping.Add("product", ".Product()");
mapping.Add("is", "==");
mapping.Add("!=", "!=");
mapping.Add("+", "+");
mapping.Add("-", "-");
mapping.Add("and", "&&");
mapping.Add("or", "||");
mapping.Add("/", "/");
mapping.Add(">", "<");
mapping.Add("<", ">");
mapping.Add("<=", ">=");
mapping.Add(">=", "<=");
mapping.Add("divided-by", "/");
mapping.Add("are", ".Are(");
mapping.Add("digits", ".Digits()");
mapping.Add("reverse-digits",".ReverseDigits()");
mapping.Add("cube", ".Cube()");
mapping.Add("factorial", ".Factorial()");
mapping.Add("sum", ".Sum()");

//Add all normal LINQ operators
mapping.Add("average", ".Average()");
mapping.Add("maximum", ".Max()");
mapping.Add("minimum", ".Min()");
mapping.Add("digits-at", ".DigitsAt(");

StringBuilder armstrongBuilder = new StringBuilder();
foreach (string to in tokens)
{
if (mapping.ContainsKey(to))
armstrongBuilder.Append(mapping[to]);
if (to.ToCharArray().All(t => Char.IsNumber(t) || t == '.'))
armstrongBuilder.Append(to);
}

return SanitizeBraces("input.Where ( n => " + armstrongBuilder.ToString() + ")");
}

void Main()
{
do
{
var inputs = Enumerable.Range(1, 10000);
Console.WriteLine("Armstrong >>");
string line = Console.ReadLine()
.Replace("(", "( ")
.Replace(")", " )");
string statement = GenerateArmStrongStatement(GetTokens(line));

LinqCompiler lc = new LinqCompiler(statement);
lc.ExternalAssemblies.Add(typeof(IntEx).Assembly);
lc.ExternalAssemblies.Add(typeof(MathEx).Assembly);
lc.AddSource("input", inputs);
line.Dump("Armstrong Query Expressed in Plain English");
statement.Dump("Generated LINQ Query");
lc.Evaluate().Dump("Answers");

} while (true);
}

3.    This query uses the LINQ compiler. So you need to point LINQPad to that DLL. To do that, press F4 to load the Query Properties window. Then provide the path to this DLL, as shown in Figure 6-9. In my case, it was available in the C:\ drive. For you, it might be somewhere else.

image

Figure 6-9. Add Evaluant LINQ Compiler to LINQPad

4.    Navigate to the Additional Namespaces Imports tab and type Evaluant.Linq.Compiler. This tells LINQPad to use the functionality available in the LINQ Compiler project.

Now you are ready to test the “experimental” version of Armstrong. If you run the query from LINQPad, you will see the output shown in Figure 6-10.

image

Figure 6-10. LINQPad is waiting on user input

The black rectangle at the bottom of Figure 6-10 is where you have to type the Armstrong query. Table 6-2 lists a few examples that you can try.

Image Tip  Remember to type queries in lowercase.

Table 6-2. Armstrong Statements Illustrating How to Use the External DSL

Example Armstrong Statement That You Can Type

What It Does

sum of the cube of the digits of the number is the number itself

Finds Armstrong numbers.

cube of the sum of the digits of the number is the number

Finds Dudeney numbers.

sum of the factorial of the digits of the number is the number

Finds factorions.

sum of the odd-digits of the number is 10

Finds all numbers that match this predicate.

sum of the odd-digits of the number is equal to the sum of the even-digits of the number and number >= 200 and number <= 1000

Finds all numbers that match the given condition.

even-digits of the number are the same as that of the odd-digits of the number

Finds all numbers where even and odd digits are the same.

digits of the number are the same as that of the reverse-digits of the number

Finds all numbers that are palindromic. The digits of these numbers read the same both ways.

Figure 6-11 shows how the interface looked when I started typing the command in the black rectangle provided.

image

Figure 6-11. Typing an Armstrong query in the LINQPad user input box

Figure 6-12 shows the result after entering the command.

image

Figure 6-12. The result of the Armstrong query (between 1 and 10,000)

How It Works

The expression entered by a user at the console is parsed. Based on the words that match Armstrong keywords (such as sumcube, and digits), it generates a LINQ query. Later, the LINQ query is passed to the LINQ compiler, which evaluates the statement.

Note that the keywords in the statement and in the LINQ query appear in reverse order (most of the time). For example, consider the query in Figure 6-11. The keywords sumcube, and digits occur in sequence in the English-like statement. However, for LINQ to work, these must be glued together in the opposite direction. In other words, the digits of the number have to be extracted first. Then those digits have to be projected, using their cube values, and finally, the cubed digits must be added together by calling sum. In order to achieve this, the code uses a stack. I have put together a small demo of this on YouTube at www.youtube.com/watch?v=x0jbfDq8-Zk.

Here’s the take-home lesson from this experiment: After you have a good embedded DSL, it is easy to expose it as an external language. You can write an interpreter like this console to translate that external DSL to your embedded DSL code and then execute it to show results to the users.

6-5. Cloning Handy F# Functions by Using LINQ

LINQ standard query operators are comprehensive enough that you can craft almost any query by using them. However, sometimes the resulting code looks ugly, and that can lead to maintenance nightmares. Although you can already use F# methods in C#, the point of this section is to show you how to use standard LINQ operators to craft any arbitrary operator that other languages (in this case, F#) offer. I have picked ones that are absent conceptually from the LINQ standard set of operators, as shown in Table 6-3.

Table 6-3. F# Operators Missing from LINQ

Operator

Purpose

Iterate

Similar to ForEach on IList<T>, this operator iterates over a collection and performs an action.

Exists2

Similar to Any but works on two consecutive elements at a time.

ForAll2

Similar to All but works on two consecutive elements at a time.

Zip3

Combines three collections into a list of tuples of three elements.

FindIndex

Returns the index of the given element in the collection, if found; otherwise, returns -1.

Pairwise

Returns a list of key/value pairs, where the keys represent the first element and the values represent the second element of consecutive-pair collections.

Scan

Generates a sequence of numbers by performing a set of operations.

ScanBack

Same as Scan, but this time the list of operations is read backward.

IntersectMany

Same as Intersect, except this operator works on a list of elements instead of just two.

UnionMany

Same as Union, but this operator works on a list of elements instead of just two.

Partition

Partitions the given collection into two parts. The first part is generated from elements for which the given predicate returns true, and the second part is formed from elements for which the given predicate returns false.

Scan

Generates a series of numbers from a given seed number and a list of functions that depicts the step calculations.

ScanBack

Same as Scan, but in this case the functions are reversed and applied.

Problem

Clone these F# methods so they can be used by any collection in a generic way, using LINQ.

Solution

Change the query type in LINQPad to C# Program, and then add the class FSharpEx shown in Listing 6-4.

Listing 6-4. Caption Here

public static class FSharpEx
{
/// <summary>
/// This method generates a list of numbers from a given seed number
/// and a list of functions that are used to generate the next number
/// one step at a time.
/// </summary>
/// <typeparam name="T">The type of the seed and the function arguments</typeparam>
/// <param name="x0">The seed value</param>
/// <param name="projectors">The step descriptions in terms of Functions</param>
/// <returns>A list of generated elements</returns>
public static IEnumerable<T> Scan<T>(this T x0, IEnumerable<Func<T, T>> projectors)
where T : IEquatable<T>
{
List<T> values = new List<T>();
values.Add(x0);
foreach (var f in projectors)
{
values.Add(f.Invoke(values.Last()));
}
return values.AsEnumerable();
}
/// <summary>
/// This is same as Scan just that the functions provided are used in reverse order
/// while generating the elements
/// unlike Scan where the sequence of the functions are used as is.
/// </summary>
/// <typeparam name="T">The type of the collection and the seed value</typeparam>
/// <param name="x0">The seed value</param>
/// <param name="projectors">The step descriptions</param>
/// <returns>A list of generated elements</returns>
public static IEnumerable<T> ScanBack<T>(this T x0, IEnumerable<Func<T, T>> projectors)
where T : IEquatable<T>
{
List<T> values = new List<T>();
values.Add(x0);
foreach (var f in projectors.Reverse())
{
values.Add(f.Invoke(values.Last()));
}
return values.AsEnumerable();
}
/// <summary>
/// This method partitions the given collection into two parts.
/// The first part contains elements for which the predicate returns true.
/// The other part contains elements for which the predicate returns false.
/// </summary>
/// <typeparam name="T">The type of the collection</typeparam>
/// <param name="collection">The collection</param>
/// <param name="predicate">The predicate.</param>
/// <returns>A tuple with two ranges. The first range has the elements for
/// which the predicate returns true and the second part returns elements
/// for which the predicate returns false.</returns>
public static Tuple<IEnumerable<T>, IEnumerable<T>> Partition<T>(
this IEnumerable<T> collection, Func<T, bool> predicate)
{
return new Tuple<IEnumerable<T>, IEnumerable<T>>(
collection.Where(c => predicate.Invoke(c)),
collection.Where(c => !predicate.Invoke(c)));

}
/// <summary>
/// Applies the given action for all elements of the given collection
/// </summary>
/// <typeparam name="T">The type of the collection</typeparam>
/// <param name="collection">The collection</param>
/// <param name="action">The action to be performed</param>
public static void Iterate<T>(this IEnumerable<T> collection, Action<T> action)
{
foreach (var v in collection)
action.Invoke(v);
}
/// <summary>
/// This method wraps three collections into one.
/// </summary>
/// <typeparam name="T1">The type of the first collection</typeparam>
/// <typeparam name="T2">The type of the second collection</typeparam>
/// <typeparam name="T3">The type of the third collection</typeparam>
/// <param name="first">The first collection</param>
/// <param name="second">The second collection</param>
/// <param name="third">The third/last collection</param>
/// <returns>A list of tuples where the items of the tuples are picked from the first,
//  second, and third collection,
/// respectively.</returns>
public static IEnumerable<Tuple<T1, T2, T3>> Zip3<T1, T2, T3>(IEnumerable<T1> first,
IEnumerable<T2> second,
IEnumerable<T3> third)
{
int smallest = (new List<int>() { first.Count(), second.Count(),
third.Count() }).Min();
for (int i = 0; i < smallest; i++)
yield return new Tuple<T1, T2, T3>(first.ElementAt(i), second.ElementAt(i),
third.ElementAt(i));
}
/// <summary>
/// Returns the index of the given item in the given collection
/// </summary>
/// <typeparam name="T">The type of the collection</typeparam>
/// <param name="collection">The collection</param>
/// <param name="predicate">The predicate to be used to search the given item</param>
/// <returns>Returns the index of the given element in the collection, else returns -1
/// if not found.</returns>
public static int FindIndex<T>(this IEnumerable<T> collection, Func<T, bool> predicate)
{
try
{
return collection.Select((c, i) => new KeyValuePair<int, bool>(i,
predicate.Invoke(c)))
.First(c => c.Value == true).Key;
}
catch (InvalidOperationException ex)
{
return -1;
}
}
/// <summary>
/// Returns a list of consecutive items as a list of key/value pairs
/// </summary>
/// <typeparam name="T">The type of the input collection</typeparam>
/// <param name="collection">The collection</param>
/// <returns>A list of key/alue pairs</returns>
public static IEnumerable<KeyValuePair<T, T>> Pairwise<T>(
this IEnumerable<T> collection)
{
return collection.Zip(collection.Skip(1), (a, b) => new KeyValuePair<T, T>(a, b));
}
/// <summary>
/// Checks whether there is a pair of consecutive entries that matches
/// the given condition
/// </summary>
/// <typeparam name="T">The type of the collection</typeparam>
/// <param name="collection">The collection</param>
/// <param name="predicate">The predicate to use</param>
/// <returns>True if such a pair exists that matches the given predicate pairwise
/// else returns false</returns>
public static bool Exists2<T>(this IEnumerable<T> collection,
Func<T, T, bool> predicate)
{
return collection.Zip(collection.Skip(1), (a, b) =>
predicate.Invoke(a, b)).Any(c => c == true);
}
/// <summary>
/// Checks whether all pairwise items (taken 2 at a time) from the given collection
/// matches the predicate or not
/// </summary>
/// <typeparam name="T">The type of the collection</typeparam>
/// <param name="collection">The collection</param>
/// <param name="predicate">The predicate to run against all pairwise coupled
/// items.</param>
/// <returns></returns>
public static bool ForAll2<T>(this IEnumerable<T> collection,
Func<T, T, bool> predicate)
{
return collection.Zip(collection.Skip(1), (a, b) => predicate.Invoke(a, b))
.All(c => c == true);
}
/// <summary>
/// Finds intersection of several collections
/// </summary>
/// <typeparam name="T">type of these collections</typeparam>
/// <param name="sets">all collections</param>
/// <returns>A list with all the elements that appear in the intersection of
/// all these collections</returns>
public static IEnumerable<T> IntersectMany<T>(this IEnumerable<IEnumerable<T>> sets)
where T : IComparable
{
HashSet<T> temp = new HashSet<T>(sets.ElementAt(0));
sets.ToList().ForEach(z => temp = new HashSet<T>(z.Intersect(temp)));
return temp;
}
/// <summary>
/// Finds the union of several collections.
/// </summary>
/// <typeparam name="T">The type of these collections</typeparam>
/// <param name="sets">All the collections, not just sets</param>
/// <returns>A collection of elements with all the elements in the total union</returns>
public static IEnumerable<T> UnionMany<T>(this IEnumerable<IEnumerable<T>> sets)
where T : IComparable
{
HashSet<T> allValues = new HashSet<T>();
sets.SelectMany(s => s).ToList().ForEach(z => allValues.Add(z));
return allValues;
}
}

Also add the following code in Main():

void Main()
{
int x = 10;
List<Func<int,int>> steps = new List<Func<int,int>>();
steps.Add( a => a + 1);
steps.Add( a => a + 3);
steps.Add( a => a - 4);
steps.Add( a => 2*a - 1);

x.Scan(steps).Dump("Scan");
x.ScanBack(steps).Dump("Scanned Back");
FSharpEx.Zip3(x.Scan(steps),x.ScanBack(steps),x.Scan(steps.Concat(steps.Skip(1))))

.Dump("Zipped");
x.Scan(steps).Iterate(a => Console.WriteLine("Score is " + a));
int[] series = {1,2,3,4,5,6,7,8,9,10};
//Check whether the given series is in AP
bool isAPSeries = series.Pairwise()
.Select (s => new {First = s.Key, Second = s.Value,
Difference = s.Value - s.Key })
.All (s => s.Difference == series[1]-series[0]);

isAPSeries.Dump("isAP using Pairwise");

series.ForAll2((a,b) => b - a == series[1] - series[0])
.Dump("isAP using ForAll2");

series.Exists2((a,b) => a + b >= 100 && a + b <= 200)
.Dump("Is there any such couple of elements");

series.Pairwise().Dump("Items picked Pairwise");

series.Partition(a => a % 2 == 0).Dump("Partitioned");

int[] theseOnes = {1,3,52,2,1};
int[] thatOnes = {4,5,2,1,3,4};
int[] otherOnes = {2,3,1,1,3,14};
FSharpEx.IntersectMany((new List<int[]>(){theseOnes, thatOnes, otherOnes }))
.Dump("Intersect Many");
FSharpEx.UnionMany((new List<int[]>(){theseOnes, thatOnes, otherOnes }))
.Dump("Union Many");

}

When I ran the preceding code, I got the results shown in Figure 6-13. I encourage you to change the values in the collections and observe the effects in the results. I have moved the values to the right in the interest of space.

image

Figure 6-13. The result of several calculations done with F# extension methods

How It Works

Because there are so many outputs, I numbered them in Figure 6-13. Here are some explanations of the results.

Image Tip  The following numbered items correspond to the numbered items in Figure 6-13.

1.    The rules to increment the given number (10, in this case) are a => a + 1 and so on. So the number changes from 10 to 11 and so on, according to the rules.

2.    This case uses the same functions as item 1, but with the order of the functions applied on the seed reversed.

3.    Three lists (all are of integer types in this case) are zipped together to form a list of tuples. Notice the result type in the header of the result grid.

4.    Because the ForEach operator can be used to perform an action over all the elements of the given collection, there is a requirement to project any collection of <T> to a List<T> just to use this functionality. However, converting any collection to a list is computationally expensive. Therefore, to reduce the projection requirement, you can use the Iteration() operator. The result (in item 4) shows the result of running an iteration over the given collection.

5.    The Pairwise() operator returns the elements wrapped into pairs—consecutive pairs, in this case. If you want sorted pairs, you can sort the collection first and then use the Pairwise() operator. In this example, Pairwise() is used to check whether the given sequence on which Pairwise() is called is an arithmetic progression (AP).

ForAll2() is used to find whether the given sequence is an AP. It would have been sufficient to check whether the values obtained from performing the subtraction between each pair of items is the same as that of the first and second element of the given collection.

6.    This grid shows the result of calling Pairwise().

7.    This grid shows the result of calling Partition(). When a partition is complete, it returns two sets of values based on the predicate provided. For this example, I have partitioned the given list into even and odd members.

8.    This grid shows the intersection and union of the collections.

6-6. Lazily Generating Items from a Recurrence Relation

There are several definitions of recurrence relations. For the purposes of this section, recurrence relations are those where the nth variable of the sequence is described by the previous elements thus far. For example, the Fibonacci series follows this logic:

F[n] = F[n-1] + F[n-2]

Therefore, any nth item in the Fibonacci series is the sum of the two preceding items.

Finding such numbers using recursion can lead to a stack overflow error. If the sequences are generated lazily—in other words, generated only when elements are required—then a stack overflow error won’t occur. To do that, you must store the intermediate results of the recurrence relation in a collection. For example, when Fibonacci numbers get evaluated, the elements identified so far can be stored in a collection. That way, to generate the next number, only the last two items need to be added together. This technique is known as memoization.

Problem

Create an embedded DSL that helps with the creation of recurrence relations lazily.

Solution

Paste the following class into LINQPad in a new query. Set the query type to C# Program. Make sure that this class is outside the Main() method.

public static class SequenceEx
{
public static IEnumerable<T> StartWith<T>(params T[] seeds)
{
return new List<T>(seeds).AsEnumerable();
}

public static IEnumerable<T> ThenFollow<T>(this IEnumerable<T> thisSequence,
Func<T,T,T> rule) where T:IEquatable<T>
{
while(true)
{
T last = thisSequence.ElementAt(thisSequence.Count () - 1);
T lastButOne = thisSequence.ElementAt(thisSequence.Count () - 2);

thisSequence = thisSequence
.Concat((new List<T>()
{rule.Invoke(last,lastButOne)}).AsEnumerable());
yield return rule.Invoke(last,lastButOne);

}
}
public static IEnumerable<T> ThenFollow<T>(this IEnumerable<T> thisSequence, Func<T,T> rule)
where T:IEquatable<T>
{
while(true)
{
T last = thisSequence.ElementAt(thisSequence.Count () - 1);

thisSequence = thisSequence.Concat((new List<T>()
{rule.Invoke(last)}).AsEnumerable());
yield return rule.Invoke(last);

}
}
}

Now add the following code in the Main() method:

void Main()
{
Func<long,long,long> A015531Rule = (x,y) =>  4 *x + 5*y;
Func<long,long,long> fibonacciRule = (x,y) => x + y;
Func<double,double> arbitraryRule = (x) => 1/(x + 1/x);
SequenceEx.StartWith<long>(0,1)
.ThenFollow(A015531Rule)
.Take(5)
.Dump("A015531");

SequenceEx.StartWith<long>(1,1)
.ThenFollow(fibonacciRule)
.Take(5)
.Dump("First few Fibonacci Numbers");

SequenceEx.StartWith(1.0)
.ThenFollow(arbitraryRule)
.Take(5)
.Dump("Arbitrary Sequence");
}

When you run it, this code generates the output shown in Figure 6-14.

image

Figure 6-14. Some recurrence relations

How It Works

The method Startwith() generates a list with the seed values. There can be as many seed values as needed, because the values are passed as a params array. Once this initial list of numbers (seed values) is created, the rule can use them to expand the list. The ThenFollow() methodtakes a Func that operates on the last (or the last and next-to-last) elements of the collection thus far and returns the new element to be added in the collection. Thus the collection grows lazily. It returns only as many elements as were requested. (This example uses Take to find the first five elements in each case.)

If you are wondering why this example is titled A015531, it’s because that’s the name of an interesting recurrence relation from the On-Line Encyclopedia of Integer Sequences (OEIS). You can see the definition at https://oeis.org/A015531.

Summary

This chapter discussed how to use LINQ to design and use domain-specific languages. I kept the examples and the designed DSLs to a bare minimum so you could see the power of DSLs using only a small amount of code. DSLs can make the lives of programmers a lot easier. You may already be thinking of designing your own DSLs for your own problem domains. As a general strategy, remember that the workflow for designing a DSL is to identify the vocabulary first, and then identify the grammar that works well to glue the vocabulary terms together.

This chapter discussed only DSLs that are suited to the functional programming capabilities offered by LINQ. But it’s worth noting that other DSLs exist. They’re not necessarily functional but are still useful. Consider the Starbucks DSL (www.fssnip.net/9w), which lets you find the cost of a cup of coffee. As a useful exercise, try cloning that DSL to C#, using extension methods on integers and doubles.