BEYOND JAVA TO SCALA - Learn Scala for Java Developers (2015)

III. BEYOND JAVA TO SCALA

This part of the book is all about the differences between Scala and Java. There are plenty of language features in Scala that don’t have an obvious analog in Java. In this part, we’ll take a closer look at some of those and explore what Scala can give us over Java.

Specifically, we’ll explore some of the language features that make writing Scala more expressive and we’ll look at some of the more functional programming idioms that Scala is so well known for.

Expressive Scala

Scala offers several features that make writing code more concise. As well as some we’ve already seen, it provides mechanisms to:

· Make methods look like functions using the special case apply method.

· Provide default behaviour for the assignment operator using a special case update method.

· Make regular method calls look like language structures, which in effect means you can define your own control structures.

Scala also offers pattern matching; we’ll look at the unapply method and its role in pattern matching.

Functional Programming Idioms

We’ll also look at some of the functional programming aspects of Scala:

· Built-in methods for mapping values (map, flatMap).

· What monads are and why you should care.

· The Option class as a way of avoiding null checks.

· Chaining monad calls.

· For comprehensions and how they work under the hood.

This only scratches the surface of functional programming but I hope that this part of the book will give you a useful head start when it comes to Scala and functional programming.

Faking Function Calls

Scala provides mechanisms to make method calls look like regular function calls. It uses special case apply and update methods to allow a kind of shorthand call notation that can reduce the clutter of your code.

The apply Method

The apply method provides a shorthand way of calling a method on a class. So, as we saw, you can use apply methods as factory-style creation methods, where given some class such as our old friend Customer:

class Customer(name: String, address: String)

…you can add an apply method to its companion object.

object Customer {

def apply(name: String, address: String) = new Customer(name, address)

}

You can then either call the method directly, or drop the apply part, at which point Scala will look for an apply method that matches your argument list and call it.

Customer.apply("Bob Fossil", "1 London Road")

Customer("Rob Randal", "14 The Arches")

You can also have multiple apply methods. For example, we could create another method to default the address field of our customer.

def apply(name: String) = new Customer(name, "No known address")

Customer("Cuthbert Colbert")
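
Putting these pieces together, a minimal self-contained sketch might look like the following. The val modifiers on the constructor parameters are an addition for this sketch, there only so that the fields can be read back; the Customer class shown earlier doesn’t have them.

```scala
// A sketch only: `val` is added to the parameters so the fields are readable.
class Customer(val name: String, val address: String)

object Customer {
  // factory-style apply, used when we write Customer(name, address)
  def apply(name: String, address: String): Customer =
    new Customer(name, address)

  // overloaded apply that defaults the address
  def apply(name: String): Customer =
    new Customer(name, "No known address")
}

val bob = Customer.apply("Bob Fossil", "1 London Road") // longhand
val rob = Customer("Rob Randal", "14 The Arches")       // shorthand
val cuthbert = Customer("Cuthbert Colbert")             // one-argument overload
```

Scala picks the overload from the argument list, so the one-argument call ends up with the default address.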

You don’t have to use apply methods as factory methods though. Most people end up using them to make their class APIs more succinct. This is key to how Scala can help make APIs more expressive. If it makes sense that a default behaviour of your class is to create an instance, fine, but you can also make other methods look like function calls using apply.

So far, we’ve been using an apply method on a singleton object (object Customer) and dropping the apply, but you can have apply methods on a class and call them on an instance variable.

For example, we could create a class called Adder and call the apply method on an instance to add two numbers together.

val add = new Adder()

add.apply(1, 3)

But we can just as easily drop it and it’ll look like we’re calling a function even though we’re actually calling a method on an instance variable.

// scala

val add = new Adder()

add(1, 3)
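
The Adder class itself isn’t shown here. A minimal sketch, assuming all it does is sum two Ints, could be:

```scala
// Hypothetical Adder: only its usage appears in the text, so this
// definition is an assumption about what it might look like.
class Adder {
  def apply(a: Int, b: Int): Int = a + b
}

val add = new Adder()

val longhand  = add.apply(1, 3) // calling the method by name
val shorthand = add(1, 3)       // the same call with apply dropped
```

Both calls go to the same apply method on the instance, so they produce the same result.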

Another example is accessing array values. Suppose we have an array of Roman numerals.

val numerals = Array("I", "II", "III", "IV", "V", "VI", "VII")

To access the array using an index, the syntax is to use parentheses rather than square brackets.

numerals(5) // yields "VI"

So using the index in a loop, we could do something like this to print the entire array:

for (i <- 0 to numerals.length - 1)

println(i + " = " + numerals(i))

What’s interesting here is that there is an apply method on array that takes an Int. So we could have written it like this:

numerals.apply(5) // yields "VI"

for (i <- 0 to numerals.length - 1)

println(i + " = " + numerals.apply(i))

What looks like language syntax to access an array is actually just a regular method call. Scala fakes it.

The update Method

Assignment works in just the same way. For example, numerals(2) = "ii" actually calls a special method called update on the Array class (def update(i: Int, x: T)).

numerals(2) = "ii"

If Scala sees the assignment operator and can find an update method with appropriate arguments, it translates the assignment to a method call.
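
As a quick check of both translations on the standard Array class, the sketch below writes each shorthand next to its longhand equivalent. The replacement values are arbitrary.

```scala
val numerals = Array("I", "II", "III", "IV", "V", "VI", "VII")

val sugared  = numerals(5)        // shorthand read
val explicit = numerals.apply(5)  // what the shorthand expands to

numerals(2) = "iii"               // shorthand assignment
numerals.update(3, "iv")          // the same kind of call, written out
```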

We can apply this idea to our own classes to make an API feel more like language syntax. Let’s say we’re in the business of telephony and part of that business is to maintain a directory of customer telephone numbers.

We can create a collection class to contain our directory, and initialise it to hold the telephone numbers of the four musketeers, like this:

class Directory {

val numbers = scala.collection.mutable.Map(

"Athos" -> "7781 456782",

"Aramis" -> "7781 823422",

"Porthos" -> "1471 342383",

"D'Artagnan" -> "7715 632982"

)

}

If we decide that the shorthand or default behaviour of the directory should be to return the telephone number of a customer, we can implement the apply method as follows:

def apply(name: String) = {

numbers.get(name)

}

That way, after creating an instance of our directory, we can print Athos’s number like this:

val yellowPages = new Directory()

println("Athos's telephone number : " + yellowPages("Athos"))

Then if we want to update a number, we could implement an updating method and call it directly. Scala’s assignment shorthand means that if we actually name our method update, we can use the assignment operator and it will call the update method for us.

So, we add an update method:

def update(name: String, number: String) = {

numbers.update(name, number)

}

Then we can call it to update a number like this:

yellowPages.update("Athos", "Unlisted")

Taking advantage of the shorthand notation, you can also use assignment.

yellowPages("Athos") = "Unlisted"
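
Assembled into one place, the directory might look like the sketch below. The explicit return types are additions for clarity. Note that because apply delegates to the map’s get method, a lookup yields an Option rather than a bare String.

```scala
import scala.collection.mutable

class Directory {
  val numbers = mutable.Map(
    "Athos" -> "7781 456782",
    "Aramis" -> "7781 823422",
    "Porthos" -> "1471 342383",
    "D'Artagnan" -> "7715 632982"
  )

  // shorthand read: yellowPages("Athos")
  def apply(name: String): Option[String] = numbers.get(name)

  // shorthand write: yellowPages("Athos") = "Unlisted"
  def update(name: String, number: String): Unit =
    numbers.update(name, number)
}

val yellowPages = new Directory()

val before = yellowPages("Athos")  // calls apply
yellowPages("Athos") = "Unlisted"  // calls update
val after = yellowPages("Athos")
val missing = yellowPages("Rochefort")
```

A name that isn’t in the directory comes back as None instead of throwing an exception.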

Multiple update Methods

We could also add a second update method, this time with an Int as the first argument.

def update(areaCode: Int, newAreaCode: String) = {

???

}

Let’s say we want it to update an area code across all entries. We could enumerate each entry to work out which numbers start with the area code from the first argument. For any that match, we go back to the original map and update the entry.

def update(areaCode: Int, newAreaCode: String) = {

numbers.foreach(entry => {

if (entry._2.startsWith(areaCode.toString))

numbers(entry._1) = entry._2.replace(areaCode.toString, newAreaCode)

})

}

The _1 and _2 are Scala notation for accessing what’s called a tuple. It’s a simple data structure that we’re using to treat what, in our case, would be a Map.Entry in Java as a single variable. The _1 and _2 are method calls that let us access the key and value respectively. Tuples are actually more general purpose than this and not just used for maps. We’re using a tuple of two elements (a Tuple2) but you can have tuples with up to twenty-two elements (Tuple22).

We can call the new update method using the shorthand assignment syntax like this:

object DirectoryExampleAlternativeUpdateMethod extends App {

val yellowPages = new Directory

println(yellowPages)

yellowPages(7781) = "7555"

println(yellowPages)

}

The outcome of this is that both Athos and Aramis will have their area codes updated.
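
One thing to be aware of is that the version above writes to the map while foreach is still traversing it. It only replaces values for keys that already exist, but as a general habit it’s safer to iterate over a snapshot. The sketch below is a variation on that idea rather than a replacement for the listing: the class name AreaCodeDirectory is hypothetical, and it swaps only the leading area code instead of every occurrence of the digits.

```scala
import scala.collection.mutable

// Hypothetical variation on the directory, cut down to the area code update.
class AreaCodeDirectory {
  val numbers = mutable.Map(
    "Athos" -> "7781 456782",
    "Aramis" -> "7781 823422",
    "Porthos" -> "1471 342383"
  )

  def update(areaCode: Int, newAreaCode: String): Unit = {
    val prefix = areaCode.toString
    // toList takes a snapshot, so the map isn't modified while we iterate it
    for ((name, number) <- numbers.toList if number.startsWith(prefix))
      numbers(name) = newAreaCode + number.stripPrefix(prefix)
  }
}

val directory = new AreaCodeDirectory
directory(7781) = "7555" // translated to directory.update(7781, "7555")
```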

Multiple Arguments to update

You can have as many arguments in the update method as you like but only the last will be used as the updated value. This makes sense, as you can only have one value to the right of an assignment operator.

The rest of the argument list is used to select the appropriate update method. So if you had another method with three arguments (areaCode, another and newAreaCode):

def update(areaCode: Int, another: String, newAreaCode: String) = ???

…the types would be used to work out which update method should be called on assignment.

yellowPages(7998) = "7668"

yellowPages(7998, "another argument") = "???"
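
To see a multi-argument update doing something concrete, here is a small sketch using a hypothetical Grid class (not part of the telephony example). The first two arguments select a cell and the last one is the value being assigned.

```scala
// Hypothetical Grid class, used only to illustrate a three-argument update.
class Grid(rows: Int, columns: Int) {
  private val cells = Array.fill(rows, columns)(0)

  def apply(row: Int, column: Int): Int = cells(row)(column)

  // everything before the last parameter comes from the parentheses on the
  // left of the assignment; the last parameter is the right-hand side
  def update(row: Int, column: Int, value: Int): Unit = {
    cells(row)(column) = value
  }
}

val grid = new Grid(2, 2)

grid(0, 1) = 42             // translated to grid.update(0, 1, 42)
val assigned = grid(0, 1)   // translated to grid.apply(0, 1)
val untouched = grid(1, 1)
```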

Summary

We’ve seen more about the apply method in this chapter; how you don’t just use it for factory-style creation methods but for building rich APIs. You can make client code more concise by making method calls look like function calls.

We also saw how the related update method works and in the same way how we can write APIs that take advantage of the assignment operator and implement custom update behaviour.

Faking Language Constructs

Scala allows you to write your code in such a way as to give the impression that you’re working with native language constructs, when really you’re just working with regular methods.

This chapter will cover:

· How Scala allows you to use curly braces instead of regular parentheses when calling methods.

· How Scala supports higher-order functions: functions that take functions as arguments or return functions as results.

· How Scala supports currying out of the box.

These things don’t sound that impressive, but combined they allow for a surprising amount of flexibility. We’ll see how these techniques can help you write more flexible and readable code.

All the code samples in this chapter are in Scala.

Curly Braces (and Function Literals)

There’s a simple rule in Scala:

Any method call which accepts exactly one argument can use curly braces to surround the argument instead of parentheses.

So, instead of this:

numerals.foreach(println(_))

You can write this:

numerals.foreach{println(_)}

All we’ve done is swap the brackets for curly braces. Not very impressive, but things start to look a bit more interesting when we introduce some new lines.

numerals.foreach {

println(_)

}

Now it begins to look like a built-in control structure. Developers are used to interpreting curly braces as demarcation of language syntax. So this looks more like the built-in for loop, even though it’s just a method call.

The main reason for doing this is to allow clients to pass in functions as arguments in a natural and concise way. When you write functions that can take functions as arguments, you’re creating higher-order functions. These allow for greater flexibility and re-use.

For example, let’s say we want to do some work and update a UI element, like a progress bar or a customer basket. The best way to do this is in a new thread so that we don’t slow down the main UI thread and cause pauses for the user.

Higher-Order Functions

If every call to update a UI element must be done on its own thread, we might end up with a naive implementation like this:

object Ui {

def updateUiElements() {

new Thread() {

override def run(): Unit = updateCustomerBasket(basket)

}.start()

new Thread() {

override def run(): Unit = updateOffersFor(customer)

}.start()

}

}

The Ui object executes the sequence of updates one after another, each on a new thread. The Ui object is managing the threading policy and the update behaviour. It would be better if something else was responsible for coordinating threading and the Ui object was left to the update behaviour. That way, we could avoid duplication and if the threading policy changes, we wouldn’t have to find all the usages scattered about the place.

The solution is to define a function that can run some other function on a thread. We could create a function called runInThread with the boilerplate threading code.

def runInThread() {

new Thread() {

override def run(): Unit = ???

}.start()

}

It will create and start a new thread but it doesn’t do anything interesting. How do we pass in a function? In Java, you’d probably pass in an anonymous instance of a Runnable or Callable or a lambda.

You do the same in Scala but rather than pass in a functional interface as the argument, you pass in a shorthand signature denoting a function argument. You define a variable as usual (function in our example below) but the type that follows the colon represents a function. Our example has no arguments and returns a value of Unit. It’s equivalent to Java 8’s signature for a lambda: () -> Void.

def runInThread(function: () => Unit) {

new Thread() {

override def run(): Unit = ???

}.start()

}

Then we just execute the function in the body of the thread. Remember the brackets denote the shorthand for executing the apply method.

def runInThread(function: () => Unit) {

new Thread() {

override def run(): Unit = function() // aka function.apply()

}.start()

}

Given the new runInThread method, we can rewrite the UI code like this:

def updateUiElements() {

runInThread(() => updateCustomerBasket(basket))

runInThread(() => updateOffersFor(customer))

}

We’ve eliminated the duplication by passing in functions to runInThread.

Higher Order Functions with Curly Braces

This doesn’t really live up to the promise of clients being able to pass functions as arguments “in a natural and concise way”. It looks a lot like Java’s lambda syntax, but we can make it look more natural and more like language syntax if we use the curly braces.

If we just replace the parentheses with curly braces, it doesn’t really improve things.

// yuk!

def updateUiElements() {

runInThread { () =>

updateCustomerBasket(basket)

}

runInThread { () =>

updateOffersFor(customer)

}

}

But we can employ another trick to get rid of the empty parentheses and arrows. We can use what’s called a call-by-name parameter.

Call-by-Name

In Java, you can’t do anything about an empty lambda argument list (e.g., () -> Void) but in Scala, you can drop the brackets from a function signature to indicate that the argument is call-by-name. To invoke it, you no longer need to call the apply method. Instead, you simply reference it.

def runInThread(function: => Unit) { // call-by-name

new Thread() {

override def run(): Unit = function // not function()

}.start()

}

The by-name parameter expression isn’t evaluated until it’s actually used; not when it’s defined. It behaves just like the longhand function did even though it looks like we’re calling the function at the point where we pass it into our runInThread method.

def updateUiElements() {

runInThread {

updateCustomerBasket(basket)

}

runInThread {

updateOffersFor(customer)

}

}

This starts to make things look a lot more natural, especially if we want to do more within a running thread. For example, let’s say we want to apply a discount before updating a customer’s basket. The braces and indents make it very clear that this happens in the same thread as the update.

def updateUiElements() {

runInThread {

applyDiscountToBasket(basket)

updateCustomerBasket(basket)

}

runInThread {

updateOffersFor(customer)

}

}

You can think of it as shorthand for creating a parameter-less lambda.
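
For a version you can actually run, the sketch below fills in the missing pieces with stand-ins. The Threads object, the Basket class with its counter, and the Ui class are all hypothetical. It also returns the Thread from runInThread, which the listings above don’t do, purely so the caller can wait for the work to finish.

```scala
import java.util.concurrent.atomic.AtomicInteger

object Threads {
  // Returning the Thread is an addition for this sketch.
  def runInThread(function: => Unit): Thread = {
    val thread = new Thread(new Runnable {
      def run(): Unit = function // by-name: evaluated here, on the new thread
    })
    thread.start()
    thread
  }
}

// Hypothetical stand-in for the basket: it just counts updates.
class Basket {
  private val updates = new AtomicInteger(0)
  def update(): Unit = updates.incrementAndGet()
  def updateCount: Int = updates.get()
}

class Ui(basket: Basket) {
  def updateUiElements(): Thread =
    Threads.runInThread {
      basket.update() // both statements run on the same thread
      basket.update()
    }
}
```

A caller would create a Basket, pass it to a Ui, call updateUiElements() and then join() the returned thread before reading updateCount.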

Call-by-name != Lazy

People often think that by-name parameters are the same thing as lazy values but this isn’t technically accurate. Yes, they aren’t evaluated until they’re encountered at runtime but unlike true lazy values, they will be evaluated every time they’re encountered.

True lazy values are evaluated the first time they’re encountered and stored so the second time you ask for the value, it’s just returned, not evaluated again.

So by-name parameters are not lazy.
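
The difference is easy to observe by counting evaluations. In this sketch (all names are made up for illustration), the by-name version evaluates its argument on every reference, whereas wrapping the parameter in a lazy val evaluates it once and reuses the result.

```scala
var evaluations = 0

// Each call bumps the counter and returns its new value.
def nextValue(): Int = {
  evaluations += 1
  evaluations
}

// By-name: `value` is re-evaluated each time it is referenced.
def twiceByName(value: => Int): Int = value + value

// Lazy: `value` is evaluated once, on first use, then cached.
def twiceLazy(value: => Int): Int = {
  lazy val cached = value
  cached + cached
}

val byName = twiceByName(nextValue()) // evaluates nextValue() twice: 1 + 2
val countAfterByName = evaluations

val viaLazy = twiceLazy(nextValue())  // evaluates nextValue() once: 3 + 3
val countAfterLazy = evaluations
```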

Currying

Using the apply method and curly braces allows us to create APIs that are expressive and natural to use. It allows us to create control abstractions that conform to what we already expect from the language in terms of syntax.

But remember what we said earlier about the curly braces rule.

Any method call which accepts exactly one argument can use curly braces to surround the argument instead of parentheses.

We can only use curly braces with single-argument methods. What if we want to add an argument to our runInThread method and still use the elegant syntax? The good news is that it’s entirely possible; we employ a technique called currying.

Let’s extend our runInThread method to add a new argument to assign a thread group.

def runInThread(group: String, function: => Unit) {

new Thread(new ThreadGroup(group), new Runnable() {

def run(): Unit = function

}).start()

}

As only single-argument lists can use braces, we have to regress the Ui object back to using parentheses.

// yuk!

def updateUiElements() {

runInThread("basket", {

applyDiscountToBasket(basket)

updateCustomerBasket(basket)

})

runInThread("customer",

updateOffersFor(customer)

)

}

If we could convert our function with two arguments into a function that takes one argument we’d be able to use the curly braces again. Fortunately for us, that’s exactly what currying is about. Currying is the process of turning a function of two or more arguments into a series of functions, each taking a single argument.

For a function of two arguments, currying would produce a function that takes one argument and returns another function. This returned function would also have a single argument (for what would have been the second argument of the original function). Confused? Let’s work through an example.

Let’s say we have a function f that takes two arguments, a and b, and returns a + b.

f(a, b) = a + b

To convert this into two functions, each with a single argument, first we create a function to take a and give back a new function (f′).

f(a) → f′

This new function should itself take a single argument, b.

f(a) → f′(b)

That entire function should return the result, a + b.

f(a) → f′(b) → a + b

We’re left with two functions (f and f′), each taking a single argument.

With the pseudo-mathematical notation on the same page, it’s worth restating my original definition and comparing the original to the curried form of the function.

For a function of two arguments, currying would produce a function that takes one argument and returns another function. This returned function would also have a single argument (for what would have been the second argument of the original function).

Fig. 3.1. Original function and steps to arrive at its curried form.

To evaluate the functions of the curried form, we’d evaluate the first function (for example, passing in a value 1).

f(1)

This would return a function which captures the value, and because what’s returned is a function, we can just evaluate it, providing a value for the last argument (2).

f(1)(2)

At this point, both values are in scope and any computation can be applied giving the final result.

I’ve been using a bit of a gorilla notation [1] to get my point across here. Using a more mathematically correct notation, we could show the function as being curried by creating a new function taking a and mapping b to a + b.

f(a) = (b ↦ a + b)

If you’re familiar with the lambda calculus [2], you’ll already know that λab.a + b is shorthand for its curried form λa.(λb.(a + b)).

Closures

Interestingly, the process of capturing a value and making it available to a second function like this is called closure. It’s where we get the term closure from when referring to anonymous functions or lambdas that capture values.
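
Before looking at Scala’s built-in support, it may help to see the same steps written out by hand with function values. This is only a sketch and the value names are ours. The curried method on two-argument function values is part of the standard library.

```scala
// f(a, b) = a + b
val uncurried: (Int, Int) => Int = (a, b) => a + b

// f(a) = (b ↦ a + b): a function that returns another function
val curriedByHand: Int => Int => Int = a => b => a + b

// The standard library can do the conversion for function values.
val viaLibrary: Int => Int => Int = uncurried.curried

// Supplying only the first argument gives back a function that has
// captured (closed over) the value 1.
val addOne: Int => Int = curriedByHand(1)

val result = addOne(2)
```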

Scala Support for Curried Functions

A regular uncurried function to add two numbers might look like this:

def add(x: Int, y: Int): Int = x + y

Scala supports curried functions out of the box, so we don’t need to do any manual conversion; all we do to turn this into its curried version is to separate out the arguments using parentheses.

def add(x: Int)(y: Int): Int = x + y

Scala has created two single-argument parameter lists for us. To evaluate the function, we’d do the following:

scala> add(1)(2)

res1: Int = 3

To see it in stages, we could just evaluate the first half like this:

scala> val f = add(1) _

f: Int => Int = <function1>

The underscore tells the compiler that we’re deliberately leaving the second argument list unapplied, so it gives us back a function rather than complaining about a missing argument. The result f is a function from Int to Int. The value 1 has been captured and is available to that function. So we can now just execute the returned function supplying our second value.

scala> f(2)

res2: Int = 3

So what does this mean for our runInThread method? Well, if we create a curried version of the function, we can get back to using our lovely curly braces.

We start by splitting the argument list into two to create the curried form of the original.

def runInThread(group: String)(function: => Unit) {

new Thread(new ThreadGroup(group), new Runnable() {

def run(): Unit = function

}).start()

}

Notice there are no other changes to make to the function. Inside runInThread everything is just as it was. However, we can now change the Ui object back to using curly braces for the second argument.

def updateUiElements() {

runInThread("basket") {

applyDiscountToBasket(basket)

updateCustomerBasket(basket)

}

runInThread("customer") {

updateOffersFor(customer)

}

}
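
The same shape works for any control abstraction, not just threading. As a minimal sketch, here is a hypothetical repeat method (it isn’t part of the Ui example) that combines a curried parameter list with a by-name parameter, so the call site reads like a built-in loop.

```scala
// Hypothetical control abstraction: evaluate `body` a given number of times.
def repeat(times: Int)(body: => Unit): Unit = {
  var remaining = times
  while (remaining > 0) {
    body // by-name: evaluated again on each pass through the loop
    remaining -= 1
  }
}

var greetings = List.empty[String]

// The first argument list uses parentheses, the second uses curly braces.
repeat(3) {
  greetings = "hello" :: greetings
}
```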

Summary

With a few built-in features, Scala allows us to write methods that look like language constructs. We can use higher-order functions to create control abstractions: functions that abstract over complex behaviour and reduce duplication yet still offer flexibility to the code that calls them.

We can use curly braces anywhere a single-argument method is used. We can use this to provide visual cues and patterns that are immediately recognisable. Using built-in currying support, we’re not limited to using this only for single-argument functions; we can create even richer APIs by converting multiple-argument functions into multiple single-argument functions.

1. A discussion of the notation used can be found at http://bit.ly/1Q2bU6s

2. Some notes on the Lambda Calculus can be found at http://bit.ly/1G4OdVo

Pattern Matching

As well as providing switch-like functionality (that’s more powerful than Java’s version), pattern matching offers a rich set of “patterns” that can be used to match against. In this chapter, we’ll look at the anatomy of patterns and talk through some examples, including literal, constructor and type query patterns.

Pattern matching also provides the ability to deconstruct matched objects, giving you access to parts of a data structure. We’ll look at the mechanics of deconstruction: extractors, which are basically objects with the special method unapply implemented.

Switching

Let’s start by looking at the pattern match expression from earlier.

val month = "August"

val quarter = month match {

case "January" | "February" | "March" => "1st quarter"

case "April" | "May" | "June" => "2nd quarter"

case "July" | "August" | "September" => "3rd quarter"

case "October" | "November" | "December" => "4th quarter"

case _ => "unknown quarter"

}

There are several key differences between Java’s switch and Scala’s match expression:

· There is no fall-through behaviour between cases in Scala. Java uses break to avoid a fall-through but Scala breaks between each case automatically.

· In Scala, a pattern match is an expression; it returns a value. Java switches must have side effects to be useful.

· We can switch on a wider variety of things with Scala, not just primitives, enums and strings. We can switch on objects, and things that fit a “pattern” of our own design. In the example, we’re using “or” to build a richer match condition.

Pattern matching also gives us:

· The ability to guard the conditions of a match; using an if, we can enrich a case to match not only on the pattern (the part straight after the case) but also on some boolean condition.

· Exceptions for failed matches; when a value doesn’t match anything at runtime, Scala will throw a MatchError exception letting us know.

· Optional compile-time checks: you can set it up so that if you forget to write a case to match all possible combinations, you’ll get a compiler warning. This is done using what’s called sealed traits.
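
A short sketch of these three points follows. The Shape hierarchy and the method names are invented for illustration. Because Shape is sealed, the compiler knows every subtype and can warn if describe misses one, whereas quarterFor deliberately has no default case so that we can see the MatchError.

```scala
import scala.util.Try

// Hypothetical sealed hierarchy: all subtypes must live in this file.
sealed trait Shape
case class Circle(radius: Double) extends Shape
case class Square(side: Double) extends Shape

def describe(shape: Shape): String = shape match {
  case Circle(radius) if radius > 10 => "big circle" // pattern plus guard
  case Circle(_)                     => "circle"
  case Square(_)                     => "square"
}

// No default case: anything outside the first quarter fails at runtime.
def quarterFor(month: String): String = month match {
  case "January" | "February" | "March" => "1st quarter"
}

// Try captures the exception so we can inspect it instead of crashing.
val failure = Try(quarterFor("August"))
```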

Patterns

The anatomy of a match expression looks like this:

value match {

case pattern guard => expression

...

case _ => default

}

So we have a value, then the match keyword, followed by a series of match cases. The value can itself be an expression, a literal or even an object.

Each case is made up of a pattern, optionally a guard condition, and the expression to evaluate on a successful match.

You might add a default, catch-all pattern at the end. The underscore is our first example of an actual pattern. It’s the wildcard pattern and means “match on anything”.

A pattern can be:

· A wildcard match (_).

· A literal match, meaning equality, used for values such as 101 or RED.

· A constructor match, meaning that a value would match if it could have been created using a specific constructor.

· A deconstruction match, otherwise known as an extractor pattern.

· A match based on a specific type, known as a type query pattern.

· A pattern with alternatives (specified with |).

Patterns can also include a variable name, which on matching will be available to the expression on the right-hand side. It’s what’s referred to as a variable ID in the language specification.

There are some more which I’ve left off; if you’re interested see the Pattern Matching section of the Scala Language Specification.

Literal Matches

A literal match is a match against any Scala literal. The example below uses a string literal and has similar semantics to a Java switch statement.

val language = "French"

language match {

case "french" => println("Salut")

case "French" => println("Bonjour")

case "German" => println("Guten Tag")

case _ => println("Hi")

}

The value must exactly match the literal in the case. In the example, the result will be to print “Bonjour” and not “Salut” as the match value has a capital F. The match is based on equality (==).

Constructor Matches

Constructor patterns allow you to match a case against how an object was constructed. Let’s say we have a SuperHero class that looks like this:

case class SuperHero(heroName: String, alterEgo: String, powers: List[Ability])

It’s a regular class with three constructor arguments, but the keyword case at the beginning designates it as a case class. For now, that just means that Scala will automatically supply a bunch of useful methods for us, like hashCode, equals, and toString.

Given the class and its fields, we can create a match expression like this:

1 object BasicConstructorPatternExample extends App {

2 val hero =

3 new SuperHero("Batman", "Bruce Wayne", List(Speed, Agility, Strength))

4

5 hero match {

6 case SuperHero(_, "Bruce Wayne", _) => println("I'm Batman!")

7 case SuperHero(_, _, _) => println("???")

8 }

9 }

Using a constructor pattern, it will match for any hero whose alterEgo field matches the value “Bruce Wayne” and print “I’m Batman”. For everyone else, it’ll print question marks.

The underscores are used as placeholders for the constructor arguments; you need three on the second case (line 7) because the constructor has three arguments. The underscore means you don’t care what their values are. Putting the value “Bruce Wayne” on line 6 means you do care and that the second argument to the constructor must match it.

With constructor patterns, the value must also match the type. Let’s say that SuperHero is a subtype of a Person.

Fig. 3.2. SuperHero is a subtype of Person.

If the hero variable was actually an instance of Person and not a SuperHero, nothing would match. In the case of no match, you’d see a MatchError exception at runtime. To avoid the MatchError, you’d need to allow non-SuperHero types to match. To do that, you could just use a wildcard as a default.

object BasicConstructorPatternExample extends App {

val hero = new Person("Joe Ordinary")

hero match {

case SuperHero(_, "Bruce Wayne", _) => println("I'm Batman!")

case SuperHero(_, _, _) => println("???")

case _ => println("I'm a civilian, don't shoot!")

}

}

Patterns can also bind a matched value to a variable. Instead of just matching against a literal (like “Bruce Wayne”) we can use a variable as a placeholder and access a matched value in the expression on the right-hand side. For example, we could ask the question:

“What super-powers does an otherwise unknown person have, if they are a superhero with the alter-ego Bruce Wayne?”

1 def superPowersFor(person: Person) = {

2 person match {

3 case SuperHero(_, "Bruce Wayne", powers) => powers

4 case _ => List()

5 }

6 }

7

8 println("Bruce has the following powers " + superPowersFor(person))

We’re still matching only on types of SuperHero with a literal match against their alter-ego, but this time the underscore in the last position on line 3 is replaced with the variable powers. This means we can use the variable on the right-hand side. In this case, we just return it to answer the question.

Variable binding is one of pattern matching’s key strengths. In practice, it doesn’t make much sense to use a literal value like “Bruce Wayne” as it limits the application. Instead, you’re more likely to replace it with either a variable or wildcard pattern.

object HeroConstructorPatternExample extends App {

def superPowersFor(person: Person) = {

person match {

case SuperHero(_, _, powers) => powers

case _ => List()

}

}

}

You’d then use values from the match object as input. To find out what powers Bruce Wayne has, you’d pass in a SuperHero instance for Bruce.

val bruce =

new SuperHero("Batman", "Bruce Wayne", List(Speed, Agility, Strength))

println("Bruce has the following powers " + superPowersFor(bruce))

The example is a little contrived as we’re using a match expression to return something that we already know. But as we’ve made the superPowersFor method more general purpose, we could also find out what powers any superhero or regular person has.

val steve =

new SuperHero("Captain America", "Steve Rogers", List(Tactics, Speed))

val jayne = new Person("Jayne Doe")

println("Steve has the following powers " + superPowersFor(steve))

println("Jayne has the following powers " + superPowersFor(jayne))
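
The snippets in this section rely on Person and Ability, which aren’t defined here. To run them, you need something along the lines of the sketch below. The Ability hierarchy, the shape of Person, and the choice to use the alter ego as a superhero’s name are all assumptions made for illustration.

```scala
// Assumed definitions: the text uses these types without showing them.
sealed trait Ability
case object Speed extends Ability
case object Agility extends Ability
case object Strength extends Ability
case object Tactics extends Ability

class Person(val name: String)

// Assumption: a superhero's "name" as a Person is their alter ego.
case class SuperHero(heroName: String, alterEgo: String, powers: List[Ability])
  extends Person(alterEgo)

def superPowersFor(person: Person): List[Ability] = person match {
  case SuperHero(_, _, powers) => powers // binds the third field to powers
  case _                       => List() // anyone who isn't a SuperHero
}

val steve = new SuperHero("Captain America", "Steve Rogers", List(Tactics, Speed))
val jayne = new Person("Jayne Doe")

val stevesPowers = superPowersFor(steve)
val jaynesPowers = superPowersFor(jayne)
```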

Constructor Patterns

Note that constructor patterns work on case classes out of the box. Technically, this is because they automatically implement a special method called unapply. We’ll see shortly how you can implement your own and achieve the same kind of thing for non-case classes.

Type Query

Using a constructor pattern, you can implicitly match against a type and access its fields. If you don’t care about the fields, you can use a type query to match against just the type.

For example, we could create a method nameFor to give us a person or superhero’s name, and call it with a list of people. We’d get back either their name, or if they’re a superhero, their alter ego.

1 object HeroTypePatternExample extends App {

2

3 val batman =

4 new SuperHero("Batman", "Bruce Wayne", List(Speed, Agility, Strength))

5 val cap =

6 new SuperHero("Captain America", "Steve Rogers", List(Tactics, Speed))

7 val jayne = new Person("Jayne Doe")

8

9 def nameFor(person: Person) = {

10 person match {

11 case hero: SuperHero => hero.alterEgo

12 case person: Person => person.name

13 }

14 }

15

16 // What's a superhero's alter ego?

17 println("Batman's Alter ego is " + nameFor(batman))

18 println("Captain America's Alter ego is " + nameFor(cap))

19 println("Jayne's Alter ego is " + nameFor(jayne))

20 }

Rather than use a sequence of instanceof checks followed by a cast, you can specify a variable and type. In the expression that follows the arrow, the variable can be used as an instance of that type. So on line 11, hero is magically an instance of SuperHero and SuperHero specific methods (like alterEgo) are available without casting.
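
Type queries aren’t limited to your own class hierarchies. As a small sketch using only standard types (the method name classify is made up), each case below narrows a value of type Any to something more specific:

```scala
def classify(value: Any): String = value match {
  case text: String   => "a string of length " + text.length // text is a String here
  case number: Int    => "an int whose double is " + (number * 2)
  case _: Throwable   => "some kind of error" // type matters, value doesn't
  case _              => "something else"
}
```

The cases are tried in order, so more specific types should come before more general ones.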

When you use pattern matching to deal with exceptions in a try and catch, it’s actually type queries that are being used.

try {

val url = new URL("http://baddotrobot.com")

val reader = new BufferedReader(new InputStreamReader(url.openStream))

var line = reader.readLine

while (line != null) {

println(line) // print the current line before reading the next one

line = reader.readLine

}

} catch {

case _: MalformedURLException => println("Bad URL")

case e: IOException => println("Problem reading data : " + e.getMessage)

}

The underscore in the MalformedURLException match shows that you can use a wildcard with type queries if you’re not interested in using the value.

Deconstruction Matches and unapply

It’s common to implement the apply method as a factory-style creation method; a method taking arguments and giving back a new instance. You can think of the special case unapply method as the opposite of this. It takes an instance and extracts values from it; usually the values that were used to construct it.

$\mathrm{apply}(a, b) \rightarrow \mathrm{object}(a, b)$

$\mathrm{unapply}(\mathrm{object}(a, b)) \rightarrow a, b$

Because they extract values, objects that implement unapply are referred to as extractors.

Given an object, an extractor typically extracts the parameters that would have created that object.

So if we want to use our Customer in a match expression, we’d add an unapply method to its companion object.

class Customer(val name: String, val address: String)

object Customer {

def unapply(???) = ???

}

An unapply method always takes an instance of the object you’d like to deconstruct, in our case a Customer.

object Customer {

def unapply(customer: Customer) = ???

}

It should return either the extracted parts of the object or something to indicate it couldn’t be deconstructed. In Scala, rather than return a null to represent this, we return the option of a result. It’s the same idea as the Optional class in Java.

object Customer {

def unapply(customer: Customer): Option[???] = ???

}

The last piece of the puzzle is to work out what can optionally be extracted from the object: the type to put in the Option parameter. If you wanted to be able to extract just the customer name, the return would be Option[String], but we want to be able to extract both the name and address (and therefore be able to match on both name and address in a match expression).

The answer is to use a tuple, the data structure we saw earlier. It’s a way of returning multiple pieces of data in a single type.

object Customer {

def unapply(customer: Customer): Option[(String, String)] = {

Some((customer.name, customer.address))

}

}

We can now use a pattern match with our customer.

val customer = new Customer("Bob", "1 Church street")

customer match {

case Customer(name, address) => println(name + " " + address)

}

You’ll notice that this looks like our constructor pattern example. That’s because it’s essentially the same thing; we used a case class before which added an unapply method for us. This time, we created it ourselves. It’s both an extractor and, because there’s a symmetry with the constructor, a constructor pattern.

More specifically, the list of values to extract in a pattern must match those in a class’s primary constructor to be called a constructor pattern. See the language spec for details.

Why Write Your Own Extractors?

Why would you implement your own extractor method (unapply) when case classes already have one? It might be simply because you can’t or don’t want to use a case class or you may not want the match behaviour of a case class; you might want custom extraction behaviour (for example, returning Boolean from unapply to indicate a match with no extraction).
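As a minimal sketch of the Boolean case mentioned above, the extractor below only answers whether a value matches and binds no variables. The Customer shape, the LongStanding name, and the five-year rule are all assumptions made for illustration; they are not taken from the book's code.

```scala
object BooleanExtractorExample {
  // Hypothetical customer with a tenure field, for illustration only
  class Customer(val name: String, val yearsACustomer: Int)

  // A Boolean extractor: it reports a match but extracts nothing
  object LongStanding {
    def unapply(customer: Customer): Boolean = customer.yearsACustomer >= 5
  }

  def describe(customer: Customer): String = customer match {
    case LongStanding() => customer.name + " is a long-standing customer"
    case _              => customer.name + " is a recent customer"
  }
}
```

Because unapply returns a Boolean, the pattern is written with empty parentheses, as in LongStanding().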

It might also be the case that you can’t modify a class but you’d like to be able to extract parts from it. You can write extractors for anything. For example, you can’t modify the String class but you still might want to extract things from it, like parts of an email address or a URL.

For example, the stand-alone object below extracts the protocol and host from a string when it’s a valid URL. It has no relationship with the String class but still allows us to write a match expression and “deconstruct” a string into a protocol and host.

object UrlExtractor {

def unapply(string: String): Option[(String, String)] = {

try {

val url = new URL(string)

Some((url.getProtocol, url.getHost))

} catch {

case _: MalformedURLException => None

}

}

}

"http://baddotrobot.com" match {

case UrlExtractor(protocol, host) => println(protocol + " " + host)

}

This decoupling between patterns and the data types they work against is called representation independence (see Section 24.6 of Programming in Scala).

Guard Conditions

You can complement the patterns we’ve seen with if conditions.

customer.yearsACustomer = 3

val discount = customer match {

case YearsACustomer(years) if years >= 5 => Discount(0.50)

case YearsACustomer(years) if years >= 2 => Discount(0.20)

case YearsACustomer(years) if years >= 1 => Discount(0.10)

case _ if blackFriday(today) => Discount(0.10)

case _ => Discount(0)

}

The condition following the pattern is called a guard. You can reference a variable if you like, so we can say that for customers of five years or more, a 50% discount applies; two years or more, 20%; and so on. The cases are tried in order, so the first guard that passes wins. If a variable isn’t required, that’s fine too. For example, we’ve got a case that says if no previous discount applies and today is Black Friday, give a discount of 10%.
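The snippet above relies on definitions that aren’t shown, so here is one self-contained way it could be filled in. The Customer and Discount shapes, the YearsACustomer extractor, and the isBlackFriday flag (standing in for blackFriday(today)) are assumptions for the sake of a runnable sketch.

```scala
object GuardExample {
  // Hypothetical types, assumed for illustration
  class Customer(val name: String, var yearsACustomer: Int)
  case class Discount(rate: Double)

  // Extractor that pulls out just the number of years
  object YearsACustomer {
    def unapply(customer: Customer): Option[Int] = Some(customer.yearsACustomer)
  }

  // Cases are tried top to bottom; the first pattern whose guard passes is used
  def discountFor(customer: Customer, isBlackFriday: Boolean): Discount =
    customer match {
      case YearsACustomer(years) if years >= 5 => Discount(0.50)
      case YearsACustomer(years) if years >= 2 => Discount(0.20)
      case YearsACustomer(years) if years >= 1 => Discount(0.10)
      case _ if isBlackFriday                  => Discount(0.10)
      case _                                   => Discount(0.0)
    }
}
```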

Map and FlatMap

In this chapter, we’ll look at some of the functional programming features of Scala, specifically the ubiquitous map and flatMap functions. We’re interested in these because they’re closely related to the idea of monads, a key feature of functional programming.

Mapping Functions

You’ll see the map function on countless classes in Scala. It’s often described in the context of collections. Classes like List, Set, and Map all have it. For these, it applies a given function to each element in the collection, giving back a new collection based on the result of that function. You “map” some function over each element of your collection.

For example, you could create a function that works out how old a person is given the year of their birth.

import java.util.Calendar

def age(birthYear: Int) = {

val currentYear = Calendar.getInstance.get(Calendar.YEAR)

currentYear - birthYear

}

We could call the map function on a list of birth years, passing in the function to create a new list of ages.

val birthdays = List(1990, 1977, 1984, 1961, 1973)

birthdays.map(age)

The result would be a list of ages. We’ve transformed the year 1990 into an age of 25, for example.

res0: List[Int] = List(25, 38, 31, 54, 42)

As map is a higher-order function, you could have written the function inline as a lambda like this:

birthdays.map(year => Calendar.getInstance.get(Calendar.YEAR) - year)

Using the underscore as a shorthand for the lambda’s parameter, it would look like this:

birthdays.map(Calendar.getInstance.get(Calendar.YEAR) - _)

It’s Like foreach

So map is a transforming function. For collections, it iterates over the collection applying some function, just like foreach does. The difference is that unlike foreach, map will collect the return values from the function into a new collection and then return that collection.
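To make the contrast concrete, here is a small sketch. The names are illustrative only; the buffer exists purely to show that foreach can only communicate through side effects.

```scala
import scala.collection.mutable.ListBuffer

object MapVersusForeach {
  val numbers = List(1, 2, 3)

  // foreach returns Unit, so any result has to be captured as a side effect
  val seen = ListBuffer[Int]()
  val fromForeach: Unit = numbers.foreach(n => seen += n * 2)

  // map collects the function's return values into a new collection
  val fromMap: List[Int] = numbers.map(n => n * 2)
}
```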

It’s trivial to implement a mapping function by hand. For example, we could create a class Mappable that takes a number of elements of type A and creates a map function.

class Mappable[A](val elements: List[A]) {

def map[B](f: Function1[A, B]): List[B] = {

???

}

}

The parameter to map is a function that transforms from type A to type B; it takes an A and returns a B. I’ve written it longhand as a type of Function1, which is equivalent to Java 8’s java.util.function.Function interface. We can also write it using Scala’s shorthand syntax and the compiler will do the conversion for us.

def map[B](f: A => B): List[B] = ...

Then it’s just a question of creating a new collection and calling the function (using apply) with each element as the argument. We’d store each result in the new collection and finally return it.

class Mappable[A](val elements: List[A]) {

def map[B](f: A => B): List[B] = {

val result = collection.mutable.MutableList[B]()

elements.foreach {

result += f.apply(_)

}

result.toList

}

}

We can test it by creating a list of numbers, making them “mappable” by creating a new instance of Mappable and calling map with an anonymous function that simply doubles the input.

object Example extends App {

val numbers: List[Int] = List(1, 2, 54, 4, 12, 43, 54, 23, 34)

val mappable: Mappable[Int] = new Mappable(numbers)

val result = mappable.map(_ * 2)

println(result)

}

The output would look like this:

List(2, 4, 108, 8, 24, 86, 108, 46, 68)

Recursion

This is a fairly typical iterative implementation; a more Scala-esque implementation would use recursion.
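One possible recursive version is sketched below. It is an illustration rather than the book’s own implementation; the helper name loop and the accumulator approach are assumptions.

```scala
import scala.annotation.tailrec

object RecursiveMapExample {
  class Mappable[A](val elements: List[A]) {
    def map[B](f: A => B): List[B] = {
      // Walk the list, prepending each transformed element to an accumulator.
      // Prepending builds the result in reverse, so reverse it at the end.
      @tailrec
      def loop(remaining: List[A], acc: List[B]): List[B] = remaining match {
        case Nil          => acc.reverse
        case head :: tail => loop(tail, f(head) :: acc)
      }
      loop(elements, Nil)
    }
  }
}
```

The @tailrec annotation asks the compiler to verify that the recursive call is in tail position, so long lists don’t overflow the stack.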

FlatMap

You’ll often see the flatMap function where you see the map function. For collections, it’s very similar in that it maps a function over the collection, storing the result in a new collection, but with a couple of differences:

· It still transforms but this time the function applies a one-to-many transformation. It takes a single argument as before but returns multiple values.

· The result would therefore end up being a collection of collections, so flatMap also flattens the result to give a single collection.

So,

· For a given collection of A, the map function applies a function to each element transforming an A to B. The result is a collection of B (i.e. List[B]).

· For a given collection of A, the flatMap function applies a function to each element transforming an A to a collection of B. This results in a collection of collections of B (i.e. List[List[B]]) which is then flattened to a single collection of B (i.e. List[B]).

Let’s say we want a mapping function to return a person’s age plus or minus a year. So if we think a person is 38, we’d return a list of 37, 38, 39.

import java.util.Calendar

def ages(birthYear: Int): List[Int] = {

val today = Calendar.getInstance.get(Calendar.YEAR)

List(today - 1 - birthYear, today - birthYear, today + 1 - birthYear)

}

The signature has changed from the previous example to return a List[Int] rather than just an Int. If we pass the list of birthday years into the map function, we get a list of lists back (res0 below).

val birthdays = List(1990, 1977, 1984)

val result = birthdays.map(ages)

println(result)

scala> birthdays.map(ages)

res0: List[List[Int]] =

List(List(24, 25, 26), List(37, 38, 39), List(30, 31, 32))

If, however, we pass it into the flatMap function, we get a flattened list back. It maps, then flattens.

scala> birthdays.flatMap(ages)

res1: List[Int] = List(24, 25, 26, 37, 38, 39, 30, 31, 32)
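The “maps, then flattens” relationship can be checked directly. The sketch below assumes a fixed reference year of 2015 (to match the output shown) so the result doesn’t depend on today’s date; the names agesIn and referenceYear are made up for this example.

```scala
object FlatMapEquivalence {
  // Assumption: a fixed year keeps the example deterministic
  val referenceYear = 2015

  def agesIn(birthYear: Int): List[Int] = {
    val age = referenceYear - birthYear
    List(age - 1, age, age + 1)
  }

  val birthYears = List(1990, 1977, 1984)

  // Mapping gives a List[List[Int]]; flattening it gives a List[Int]
  val mappedThenFlattened: List[Int] = birthYears.map(agesIn).flatten

  // flatMap does both steps in one call
  val flatMapped: List[Int] = birthYears.flatMap(agesIn)
}
```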

If you wanted to write your own version of flatMap, it might look something like this (notice the return type of the function).

class FlatMappable[A](elements: A*) {

def flatMap[B](f: A => List[B]): List[B] = {

val result = collection.mutable.MutableList[B]()

elements.foreach {

f.apply(_).foreach {

result += _

}

}

result.toList

}

}

The first loop will enumerate the elements of the collection and apply the function to each. Because this function itself returns a list, another loop is needed to enumerate each of these, adding them into the result collection. This is the bit that flattens the function’s result.

To test it, let’s start by creating a function that goes from an Int to a collection of Int. It gives back all the odd numbers between zero and the argument.

def oddNumbersTo(end: Int): List[Int] = {

val odds = collection.mutable.MutableList[Int]()

for (i <- 0 to end) {

if (i % 2 != 0) odds += i

}

odds.toList

}

We then just create an instance of our class with a few numbers in. Call flatMap and you’ll see that all odd numbers from 0 to 1, 0 to 2, and 0 to 10 are collected into a list.

object Example {

def main(args: Array[String]) {

val mappable = new FlatMappable(1, 2, 10)

val result = mappable.flatMap(oddNumbersTo)

println(result)

}

}

The output would be the following:

List(1, 1, 1, 3, 5, 7, 9)

Not Just for Collections

We’ve seen how map and flatMap work for collections, but they also exist on many other classes. More generally, map and flatMap operate on what are called monads. In fact, having map and flatMap behaviour is one of the defining features of monads.

So just what are monads? We’ll look at that next.

Monads

Monads are one of those things that people love to talk about but which remain elusive and mysterious. If you’ve done any reading on functional programming, you will have come across the term.

Despite all the literature, the subject is often not well understood, partly because monads come from the abstract mathematics field of category theory and partly because, in programming languages, Haskell dominates the literature. Neither Haskell nor category theory is particularly relevant to the mainstream developer and both bring with them concepts and terminology that can be challenging to get your head around.

The good news is that you don’t have to worry about any of that stuff. You don’t need to understand category theory for functional programming. You don’t need to understand Haskell to program with Scala.

Basic Definition

A layman’s definition of a monad might be:

Something that has map and flatMap functions.

This isn’t the full story, but it will serve us as a starting point.

We’ve already seen that collections in Scala are all monads. It’s useful to transform these with map and flatten one-to-many transformations with flatMap. But map and flatMap do different things on different types of monads.

Option

Let’s take the Option class. You can use Option as a way of avoiding nulls, but just how does it avoid nulls and what has it got to do with monads? There are two parts to the answer:

1. You avoid returning null by returning a subtype of Option to represent no value (None) or a wrapper around a value (Some). As both “no value” and “some value” are of type Option, you can treat them consistently. You should never need to say “if not null”.

2. How you actually go about treating Option consistently is to use the monadic methods map and flatMap. So Option is a monad.

Null Object Pattern

If you’ve ever seen the Null Object pattern, you’ll notice it’s a similar idea. The Null Object pattern allows you to replace a type with a subtype to represent no value. You can call methods on the instance as if it were a real value but it essentially does nothing. It’s substitutable for a real value but usually has no side effects.

The main difference is that the methods you can call, defined by the instance’s super-type, are usually business methods. The common methods of a monad are map and flatMap and are lower level, functional programming abstractions.

We know what map and flatMap do for collections, but what do they do for an Option?

The map Function

The map function still transforms an object, but it’s an optional transformation. It will apply the mapping function to the value of an option, if it has a value. The value and no value options are implemented as subclasses of Option: Some and None respectively.

Fig. 3.3. The `Option` classes.


A mapping only applies if the option is an instance of Some. If it has no value (i.e., it’s an instance of None), it will simply return another None.
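A short sketch of that behaviour, using only the standard library’s Option (the value names are illustrative):

```scala
object OptionMapExample {
  val someValue: Option[Int] = Some(20)
  val noValue: Option[Int] = None

  // The function is applied to the wrapped value and the result is re-wrapped
  val doubledSome: Option[Int] = someValue.map(_ * 2)

  // There is nothing to transform, so the function is never called
  val doubledNone: Option[Int] = noValue.map(_ * 2)
}
```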

This is useful when you want to transform something but not worry about checking if it’s null. For example, we might have a Customers trait with repository methods add and find. What should we do in implementations of find when a customer doesn’t exist?

trait Customers extends Iterable[Customer] {

def add(customer: Customer)

def find(name: String): Customer

}

A typical Java implementation would likely return null or throw some kind of NotFoundException. For example, the following Set-based implementation returns a null if the customer cannot be found:

class CustomerSet extends Customers {

private val customers = mutable.Set[Customer]()

def add(customer: Customer) = customers.add(customer)

def find(name: String): Customer = {

for (customer <- customers) {

if (customer.name == name)

return customer

}

null

}

def iterator: Iterator[Customer] = customers.iterator

}

Returning null and throwing exceptions both have similar drawbacks.

Neither communicates intent very well. If you return null, clients need to know that’s a possibility so they can avoid a NullPointerException. But what’s the best way to communicate that to clients? ScalaDoc? Ask them to look at the source? Both are easy for clients to miss. Exceptions may be somewhat clearer but as Scala exceptions are unchecked, they’re just as easy for clients to miss.

You also force unhappy path handling to your clients. Assuming that consumers do know to check for a null, you’re asking multiple clients to implement defensive strategies for the unhappy path. You’re forcing null checks on people and can’t ensure consistency, or even that people will bother.

Defining the find method to return an Option improves the situation. Below, if we find a match, we return Some customer or None otherwise. This communicates at an API level that the return type is optional. The type system forces a consistent way of dealing with the unhappy path.

trait Customers extends Iterable[Customer] {

def add(customer: Customer)

def find(name: String): Option[Customer]

}

Our implementation of find can then return either a Some or a None.

def find(name: String): Option[Customer] = {

for (customer <- customers) {

if (customer.name == name)

return Some(customer)

}

None

}

Let’s say that we’d like to find a customer and get their total shopping basket value. Using a method that can return null, clients would have to do something like the following, as Albert may not be in the repository.

val albert = customers.find("Albert") // can return null

val basket = if (albert != null) albert.total else 0D

If we use Option, we can use map to transform from an option of a Customer to an option of their basket value.

val basketValue: Option[Double] =

customers.find("A").map(customer => customer.total)

Notice that the return type here is an Option[Double]. If Albert isn’t found, map will return a None to represent no basket value. Remember that the map on Option is an optional transformation.

When you want to actually get hold of the value, you need to get it out of the Option wrapper. The API of Option will only allow you to call get, getOrElse or continue processing monadically using map and flatMap.

Option.get

To get the raw value, you can use the get method but it will throw an exception if you call it against no value. Calling it is a bit of a smell as it’s roughly equivalent to ignoring the possibility of a NullPointerException. You should only call it when you know the option is a Some.

// could throw an exception

val basketValue = customers.find("A").map(customer => customer.total).get

To ensure the value is a Some, you could pattern match like the following, but again, it’s really just an elaborate null check.

val basketValue: Double = customers.find("Missing") match {

case Some(customer) => customer.total // avoids the exception

case None => 0D

}

Option.getOrElse

Calling getOrElse is often a better choice as it forces you to provide a default value. It has the same effect as the pattern match version, but with less code.

val basketValue =

customers.find("A").map(customer => customer.total).getOrElse(0D)

Monadically Processing Option

If you want to avoid using get or getOrElse, you can use the monadic methods on Option. To demonstrate this, we need a slightly more elaborate example. Let’s say we want to sum the basket value of a subset of customers. We could create the list of names of customers we’re interested in and find each of these by transforming (mapping) the customer names into a collection of Customer objects.

In the example below, we create a customer database, adding some sample data before mapping.

val database = Customers()

val address1 = Some(Address("1a Bridge St", None))

val address2 = Some(new Address("2 Short Road", Some("AL1 2PY")))

val address3 = Some(new Address("221b Baker St", Some("NW1")))

database.add(new Customer("Albert", address1))

database.add(new Customer("Beatriz", None))

database.add(new Customer("Carol", address2))

database.add(new Customer("Sherlock", address3))

val customers = Set("Albert", "Beatriz", "Carol", "Dave", "Erin")

customers.map(database.find(_))

We can then transform the customers again to a collection of their basket totals.

customers.map(database.find(_).map(_.total))

Now here’s the interesting bit. If this transformation were against a value that could be null, and not an Option, we’d have to do a null check before carrying on. However, as it is an option, if the customer wasn’t found, the map would just not do the transformation and return another “no value” Option.

When finally we want to sum all the basket values and get a grand total, we can use the built-in function sum.

customers.map(database.find(_).map(_.total)).sum // wrong!

However, this isn’t quite right. Chaining the two map functions returns a Set[Option[Double]], and we can’t sum that. We need to flatten this down to a sequence of doubles before summing.

customers.map(database.find(_).map(_.total)).flatten.sum

Notice the position of the inner map here: we map immediately on the Option returned by find.

The flattening will discard any Nones, so afterwards the collection size will be 3. Only Albert, Carol, and Beatriz’s baskets get summed.

The Option.flatMap Function

Above, we replicated flatMap behaviour by mapping and then flattening, but we could have used flatMap on Option directly.

The first step is to call flatMap on the names instead of map. As flatMap does the mapping and then flattens, we immediately get a collection of Customer.

val r: Set[Customer] = customers.flatMap(name => database.find(name))

The flatten part drops all the Nones, so the result is guaranteed to contain only customers that exist in our repository. We can then simply transform those customers to their basket total, before summing.

customers

.flatMap(name => database.find(name))

.map(customer => customer.total)

.sum
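The two approaches can be compared side by side in a self-contained sketch. Everything here is assumed for illustration: a Map stands in for the customer repository, Customer is reduced to a name and a total, and the basket values are invented. A List of names is used rather than a Set, because mapping a Set to totals would collapse any customers whose totals happen to be equal.

```scala
object BasketTotalSketch {
  // Hypothetical, simplified customer with a precomputed basket total
  case class Customer(name: String, total: Double)

  // A Map standing in for the repository; get already returns an Option
  val database: Map[String, Customer] = Map(
    "Albert"  -> Customer("Albert", 10.0),
    "Beatriz" -> Customer("Beatriz", 20.5),
    "Carol"   -> Customer("Carol", 5.0)
  )

  def find(name: String): Option[Customer] = database.get(name)

  val names = List("Albert", "Beatriz", "Carol", "Dave", "Erin")

  // map on the names, map on each Option, then flatten away the Nones
  val viaFlatten: Double = names.map(find(_).map(_.total)).flatten.sum

  // flatMap drops the Nones straight away, leaving only real customers
  val viaFlatMap: Double = names.flatMap(name => find(name)).map(_.total).sum
}
```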

Dropping the no value options is a key behaviour for flatMap here. For example, compare the flatten on a list of lists:

scala> val x = List(List(1, 2), List(3), List(4, 5))

x: List[List[Int]] = List(List(1, 2), List(3), List(4, 5))

scala> x.flatten

res0: List[Int] = List(1, 2, 3, 4, 5)

…to a list of options.

scala> val y = List(Some("A"), None, Some("B"))

y: List[Option[String]] = List(Some(A), None, Some(B))

scala> y.flatten

res1: List[String] = List(A, B)

More Formal Definition

As a more formal definition, a monad must:

· Operate on a parameterised type, which implies it’s a “container” for another type (this is called a type constructor).

· Have a way to construct the monad from its underlying type (the unit function).

· Provide a flatMap operation (sometimes called bind).

Option and List both meet these criteria.

| | Option | List |
| --- | --- | --- |
| Parameterised (type constructor) | Option[A] | List[T] |
| Construction (unit) | Option.apply(x), Some(x), None | List(x, y, z) |
| flatMap (bind) | def flatMap[B](f: A => Option[B]): Option[B] | def flatMap[B](f: A => List[B]): List[B] |

The definition doesn’t mention map, though, and our layman’s definition for monad was:

Something that has map and flatMap functions.

I wanted to introduce flatMap in terms of map because it always applies a mapping function before flattening. It’s true that to be a monad you only have to provide flatMap, but in practice monads also supply a map function. This is because all monads are also functors; it’s functors that more formally have to provide map.

So the technical answer is that providing flatMap, a parameterised type, and the unit function makes something a monad. But all monads are functors and map comes from functor.
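To illustrate why flatMap and unit are enough, the sketch below defines a tiny Option-like type and derives map from the other two. The type Maybe and its members are hypothetical names invented for this example; the standard library’s Option is implemented differently.

```scala
object MonadSketch {
  // A parameterised "container" type (the type constructor)
  sealed trait Maybe[+A] {
    // flatMap (bind): apply f only when there is a value
    def flatMap[B](f: A => Maybe[B]): Maybe[B] = this match {
      case Just(value) => f(value)
      case Empty       => Empty
    }

    // map comes for free: transform the value, then re-wrap it with unit
    def map[B](f: A => B): Maybe[B] = flatMap(a => Maybe.unit(f(a)))
  }

  case class Just[+A](value: A) extends Maybe[A]
  case object Empty extends Maybe[Nothing]

  object Maybe {
    // unit: construct the monad from its underlying type
    def unit[A](a: A): Maybe[A] = Just(a)
  }
}
```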

Fig. 3.4. The `Functor` and `Monad` behaviours.


Summary

In this chapter, I explained that when people talk about monadic behaviour, they’re really just talking about the map and flatMap functions. The semantics of map and flatMap can differ depending on the type of monad but they share a formal, albeit abstract, definition.

We looked at some concrete examples of the monadic functions on List and Option, and how we can use these with Option to avoid null checks. The real power of monads, though, is in “chaining” these functions to compose behaviour into a sequence of simple steps. To really see this, we’re going to look at some more elaborate examples in the next chapter, and see how for comprehensions work under the covers.

For Comprehensions

The last chapter focused on monads and the map and flatMap functions. In this chapter we’re going to focus on just flatMap behaviour. Specifically, we’ll look at how to chain flatMap function calls before finally yielding results. For comprehensions actually use flatMap under the hood, so we’ll look at the relationship in detail and explain how for comprehensions work.

Where We Left Off

Hopefully you’re now comfortable with the idea of flatMap. We looked at it for the collection classes and for Option. Recall that we used flatMap to map over customer names that may or may not exist in our database. By doing so, we could sum customer basket values.

customers

.flatMap(name => database.find(name))

.map(customer => customer.total)

.sum

Now let’s say that we’d like to generate a shipping label for a customer. We can look up a customer in our repository and if they have a street address and a postcode, we can generate a shipping label.

The caveats are:

1. A customer may or may not exist in the repository.

2. A given customer may or may not have an address object.

3. An address object must contain a street but may or may not contain a postcode.

So, to generate a label, we need to:

1. Find a customer (who may or may not exist) by name.

2. Get the customer’s address (which also may or may not exist).

3. Given the address, get the shipping information from it. (We can expect an Address object to contain a street address, but it may or may not have a postcode.)

Using Null Checks

If we were to implement this where the optionality was expressed by returning nulls, we’d be forced to do a bunch of null checks. We have four customers: Albert, Beatriz, Carol, and Sherlock. Albert has an address but no postcode, Beatriz hasn’t given us her address, and the last two have full address information.

val customers = Customers()

val address1 = Some(Address("1a Bridge St", None))

val address2 = Some(new Address("2 Short Road", Some("AL1 2PY")))

val address3 = Some(new Address("221b Baker St", Some("NW1")))

customers.add(new Customer("Albert", address1))

customers.add(new Customer("Beatriz", None))

customers.add(new Customer("Carol", address2))

customers.add(new Customer("Sherlock", address3))

Given a list of customers, we can attempt to create shipping labels. As you can see, the list below includes people that don’t exist in the database.

val all = Set("Albert", "Beatriz", "Carol", "Dave", "Erin", "Sherlock")

Next, we create a function to return the list of shipping labels, collecting them in a mutable set. For every name in our list, we attempt to find the customer in the database (using customers.find). As this could return null, we have to check the returned value isn’t null before we can get their address.

Getting the address can return null, so we have to check for null again before getting their postcode. Once we’ve checked the postcode isn’t null, we can finally call a method (shippingLabel) to create a label and add it to the collection. Were we to run it, only Carol and Sherlock would get through all the null checks.

def generateShippingLabels() = {

val labels = mutable.Set[String]()

all.foreach { name =>

val customer: Customer = customers.find(name)

if (customer != null) {

val address: Address = customer.address

if (address != null) {

val postcode: String = address.postcode

if (postcode != null) {

labels.add(

shippingLabel(customer.name, address.street, postcode))

}

}

}

}

labels

}

def shippingLabel(name: String, street: String, postcode: String) = {

"Ship to:\n" + "========\n" + name + "\n" + street + "\n" + postcode

}

Using flatMap with Option

If, instead of returning null for no customer, we were to use Option as the return type, we could reduce the code using flatMap.

1 def generateShippingLabel(): Set[String] = {

2 all.flatMap {

3 name => customers.find(name).flatMap {

4 customer => customer.address.flatMap {

5 address => address.postcode.map {

6 postcode => {

7 shippingLabel(customer.name, address.street, postcode)

8 }

9 }

10 }

11 }

12 }

13 }

14

15 def shippingLabel(name: String, street: String, postcode: String) = {

16 "Ship to:\n" + "========\n" + name + "\n" + street + "\n" + postcode

17 }

We start in the same way as before, by enumerating each of the names in our list, calling find on the database for each. We use flatMap to do this as we’re transforming from a single customer name (String) to a monad (Option).

You can think of the option as being like a list with at most one element in it (a Some holds one value, a None holds none), so we’re doing a “one-to-many”-like transformation. As we saw in the flatMap section, this implies we’ll need to flatten the “many” back down into “one” later, hence the flatMap.

After the initial flatMap where we find a customer in the database, we flatMap the result. If no customer was found, it wouldn’t continue any further. So on line 4, we can be sure a customer actually exists and can go ahead and get their address. As address is optional, we can flatMap again, dropping out if a customer doesn’t have an address.

On line 5, we can request a customer’s postcode. Postcode is optional, so only if we have one do we transform it (and the address details) into a shipping label. The map call takes care of that for us; remember that map here only applies the function (shippingLabel) when we have a value (i.e., postcode is an instance of Some).

Notice that we didn’t need to create a mutable collection to store the shipping labels. Any transformation function like map or flatMap will produce a new collection with the transformed results. So the call to map on line 5 wraps the label created on line 7 in an Option, and the outer flatMap on line 2 collects the labels into a newly created Set for us. One final comment: the resulting collection is a Set[String] because the shippingLabel method returns a String.
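The listing above depends on classes defined elsewhere, so here is a self-contained sketch that can be run as is. The case class shapes for Customer and Address and the Map-backed find are assumptions made to keep the example short; the sample data mirrors the four customers described earlier.

```scala
object ShippingLabelSketch {
  // Hypothetical shapes: optional parts are modelled as Option fields
  case class Address(street: String, postcode: Option[String])
  case class Customer(name: String, address: Option[Address])

  // A Map standing in for the customer repository
  val database: Map[String, Customer] = List(
    Customer("Albert", Some(Address("1a Bridge St", None))),
    Customer("Beatriz", None),
    Customer("Carol", Some(Address("2 Short Road", Some("AL1 2PY")))),
    Customer("Sherlock", Some(Address("221b Baker St", Some("NW1"))))
  ).map(customer => customer.name -> customer).toMap

  def find(name: String): Option[Customer] = database.get(name)

  def shippingLabel(name: String, street: String, postcode: String): String =
    "Ship to:\n" + "========\n" + name + "\n" + street + "\n" + postcode

  val all = Set("Albert", "Beatriz", "Carol", "Dave", "Erin", "Sherlock")

  // Each flatMap drops out as soon as a step yields None
  def generateShippingLabels(): Set[String] =
    all.flatMap { name =>
      find(name).flatMap { customer =>
        customer.address.flatMap { address =>
          address.postcode.map { postcode =>
            shippingLabel(customer.name, address.street, postcode)
          }
        }
      }
    }
}
```

With this sample data, only Carol and Sherlock have both an address and a postcode, so only two labels are produced.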

How For Comprehensions Work

When you do a regular for loop, the compiler converts (or de-sugars) it into a method call to foreach.

for (i <- 0 to 5) {

println(i)

}

// is de-sugared as

(0 to 5).foreach(println)

A nested loop is de-sugared like this:

for (i <- 0 to 5; j <- 0 to 5) {

println(i + " " + j)

}

// is de-sugared as

(0 to 5).foreach { i =>

(0 to 5).foreach { j => {

println(i + " " + j)

}

}

}

If you do a for with a yield (a for comprehension) the compiler does something different:

for (i <- 0 to 5) yield {

i + 2

}

The yield is about returning a value. A for without a yield, although an expression, will return Unit. This is because the foreach method returns Unit. A for with a yield will return whatever is in the yield block. It’s converted into a call to map rather than foreach. So, the de-sugared form of the above would look like this:

// de-sugared form of "for (i <- 0 to 5) yield i + 2"

(0 to 5).map(i => i + 2)

It’s mapping a sequence of numbers (0 to 5) into another sequence of numbers (2 to 7).

It’s important to realise that whatever is in the yield block represents the function that’s passed into map. The map itself operates on whatever is in the for part (i.e., for (i <- 0 to 5)). It may be easier to recognise when we reformat the example above like this:

for {

i <- 0 to 5 // map operates on this collection

} yield {

i + 2 // the function to pass into map

}

It gets more interesting when we have more than one generator between the for and the yield.

val x: Seq[(Int, Int)] = for {

i <- 0 to 5

j <- 0 to 5

} yield {

(i, j)

}

println(x)

Curly Braces or Parentheses?

Notice how I’ve used curly braces instead of parentheses in some examples? It’s a more common style to use curly braces for nested for loops or loops with a yield block.

This will perform the nested loop like before but rather than translate to nested foreach calls, it translates to flatMap calls followed by a map. Again, the final map is used to transform the result using whatever is in the yield block.

// de-sugared

val x: Seq[(Int, Int)] = (0 to 5).flatMap {

i => (0 to 5).map {

j => (i, j)

}

}

It’s exactly the same as before; the yield block provides the function that’s passed into map, and what it maps over is determined by the for expression. In this example, we’re combining two ranges of 0 to 5 into a collection of tuples, representing their Cartesian product.

Seq((0,0), (0,1), (0,2), (0,3), (0,4), (0,5),

(1,0), (1,1), (1,2), (1,3), (1,4), (1,5),

(2,0), (2,1), (2,2), (2,3), (2,4), (2,5),

(3,0), (3,1), (3,2), (3,3), (3,4), (3,5),

(4,0), (4,1), (4,2), (4,3), (4,4), (4,5),

(5,0), (5,1), (5,2), (5,3), (5,4), (5,5))

If we break this down and go through the steps, we can see how we arrived at the de-sugared form. We start with two sequences of numbers: a and b.

val a = (0 to 5)

val b = (0 to 5)

When we map the collections, we get a collection of collections. The final map returns a tuple, so the return type is a sequence of sequences of tuples.

val x: Seq[Seq[(Int, Int)]] = a.map(i => b.map(j => (i, j)))

To turn this into a single collection of tuples, we have to flatten the nested collections, which is what flatMap does. So although we could do the following, it’s much more straightforward to call flatMap directly.

val x: Seq[(Int, Int)] = a.map(i => b.map(j => (i, j))).flatten

// is equivalent to

val x: Seq[(Int, Int)] = a.flatMap(i => b.map(j => (i, j)))
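To see the steps side by side, here is a small sketch that builds the nested form, the flattened form, the flatMap form and the for comprehension, then compares them. The object and value names (FlattenVersusFlatMap, nested, flattened and so on) are made up for this illustration.

```scala
object FlattenVersusFlatMap {
  val a = 0 to 5
  val b = 0 to 5

  // map inside map gives a sequence of sequences
  val nested: Seq[Seq[(Int, Int)]] = a.map(i => b.map(j => (i, j)))

  // flattening it by hand
  val flattened: Seq[(Int, Int)] = nested.flatten

  // flatMap maps and flattens in one step
  val flatMapped: Seq[(Int, Int)] = a.flatMap(i => b.map(j => (i, j)))

  // the for comprehension we started with
  val comprehension: Seq[(Int, Int)] = for { i <- a; j <- b } yield (i, j)

  def main(args: Array[String]): Unit = {
    println(nested.size)    // 6 inner sequences
    println(flattened.size) // 36 tuples
    println(flattened == flatMapped && flatMapped == comprehension)
  }
}
```

All three flat versions contain the same 36 tuples in the same order; only the nested version keeps the intermediate grouping.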

Finally, Using a For Comprehension for Shipping Labels

What does all that mean for our shipping label example? We can convert our chained flatMap calls to use a for comprehension and neaten the whole thing up. We started with a sequence of chained calls to flatMap.

def generateShippingLabel_FlatMapClosingOverVariables(): Set[String] = {

all.flatMap {

name => customers.find(name).flatMap {

customer => customer.address.flatMap {

address => address.postcode.map {

postcode => shippingLabel(name, address.street, postcode)

}

}

}

}

}

After converting to the for comprehension, each call to flatMap is placed in the for as a nested expression. The final one represents the map call. Its argument (the mapping function) is what’s in the yield block.

def generateShippingLabel_ForComprehension(): Set[String] = {

for {

name <- all // <- flatMap

customer <- customers.find(name) // <- flatMap

address <- customer.address // <- flatMap

postcode <- address.postcode // <- map

} yield {

shippingLabel(name, address.street, postcode) // <- map argument

}

}

This is much more succinct. It’s easier to reason about the conditional semantics when it’s presented like this; if there’s no customer found, it won’t continue to the next line. If you want to extend the nesting, you can just add another line and not be bogged down by noise or indentation.
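If you’d like to run the for comprehension on its own, the sketch below wraps it in some stand-in definitions. The Customer and Address case classes, the customers object, the sample data and the label format are all assumptions made for this illustration; they are not the definitions used elsewhere in the book. Only the shape of the for comprehension is taken from the example above.

```scala
object ShippingLabels {
  // Hypothetical stand-ins for the domain classes (assumed, not the book's own)
  case class Address(street: String, postcode: Option[String])
  case class Customer(name: String, address: Option[Address])

  // Hypothetical customer lookup backed by a Map
  object customers {
    private val byName = Map(
      "Albert"  -> Customer("Albert", Some(Address("1 High Street", Some("AB1 2CD")))),
      "Beatriz" -> Customer("Beatriz", Some(Address("2 Low Road", None))),
      "Carol"   -> Customer("Carol", None)
    )
    def find(name: String): Option[Customer] = byName.get(name)
  }

  // "Dave" is deliberately not a known customer
  val all: Set[String] = Set("Albert", "Beatriz", "Carol", "Dave")

  // Assumed label format, for illustration only
  def shippingLabel(name: String, street: String, postcode: String): String =
    s"$name, $street, $postcode"

  def generateShippingLabel_ForComprehension(): Set[String] =
    for {
      name     <- all                  // stops here for no one
      customer <- customers.find(name) // skips Dave (not found)
      address  <- customer.address     // skips Carol (no address)
      postcode <- address.postcode     // skips Beatriz (no postcode)
    } yield shippingLabel(name, address.street, postcode)

  def main(args: Array[String]): Unit =
    generateShippingLabel_ForComprehension().foreach(println)
}
```

With this sample data, only Albert has a customer record, an address and a postcode, so only one label is produced. The other three names drop out at different lines without any explicit null or isDefined checks.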

The syntax is declarative but mimics an imperative style. It doesn’t force a particular implementation on you. That’s to say, with imperative for loops, you don’t have a choice about how the loop is executed. If you wanted to do it in parallel, for example, you’d have to implement the concurrency yourself.

Using a declarative approach like this means that the underlying objects are responsible for how they execute, which gives you more flexibility. Remember, this just calls flatMap and classes are free to implement flatMap however they like.

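For comprehensions work with any monad, and you can use your own classes in them if you implement the monadic methods: map and flatMap for a for with a yield, foreach for a for without one, and withFilter if you use guards.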
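As a minimal sketch of using your own class, the example below defines a hypothetical Logged container that carries a value along with a list of log messages. The class, the parse and twice helpers and their messages are invented for this illustration. Because Logged defines map and flatMap, the compiler accepts it in a for comprehension.

```scala
object CustomForComprehension {
  // Hypothetical container: a value plus a log of the steps that produced it
  final case class Logged[A](value: A, log: List[String]) {
    // transform the value, keep the log as it is
    def map[B](f: A => B): Logged[B] = Logged(f(value), log)

    // run the next step and append its log to ours
    def flatMap[B](f: A => Logged[B]): Logged[B] = {
      val next = f(value)
      Logged(next.value, log ++ next.log)
    }
  }

  def parse(s: String): Logged[Int] = Logged(s.toInt, List(s"parsed $s"))
  def twice(n: Int): Logged[Int] = Logged(n * 2, List(s"doubled $n"))

  // de-sugars to parse("21").flatMap(n => twice(n).map(d => s"answer: $d"))
  val result: Logged[String] = for {
    n <- parse("21")
    d <- twice(n)
  } yield s"answer: $d"

  def main(args: Array[String]): Unit = {
    println(result.value) // answer: 42
    println(result.log)   // List(parsed 21, doubled 21)
  }
}
```

Here flatMap decides how the logs are combined, which is the flexibility described above: the for comprehension only says what to chain, and the class decides what chaining means.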

Summary

In this chapter we looked at chaining calls to flatMap in the context of printing shipping labels. We looked at how for comprehensions work and how they’re syntactic sugar over regular Scala method calls. Specifically, we looked at loops and nested loops, how they’re equivalent to calling foreach, how for loops with a yield block translate to mapping functions, and how nested loops with yield blocks translate to flatMap then map functions.

This last point is what allowed us to convert our lengthy shipping label example into a succinct for comprehension.