Functional Programming - Seven Concurrency Models in Seven Weeks (2014)

Chapter 3. Functional Programming

Functional Programming is like a car powered by hydrogen fuel cells—advanced, futuristic, and not yet widely used, but it’s what we’ll all rely on in twenty years.

In contrast to an imperative program, which consists of a series of statements that change global state when executed, a functional program models computation as the evaluation of expressions. Those expressions are built from pure mathematical functions that are both first-class (can be manipulated like any other value) and side effect–free. It’s particularly useful when dealing with concurrency because the lack of side effects makes reasoning about thread safety much easier. It is also the first model we’ll look at that allows parallelism to be represented directly.

If It Hurts, Stop Doing It

The rules about locking that we discussed in Chapter 2, Threads and Locks, apply only to data that is both shared between threads and might change—in other words, shared mutable state. Data that doesn’t change (is immutable) can be accessed by multiple threads without any kind of locking.

This is what makes functional programming so compelling when it comes to concurrency and parallelism—functional programs have no mutable state, so they cannot suffer from any of the problems associated with shared mutable state.

In this chapter we’re going to look at functional programming in Clojure,[9] a dialect of Lisp that runs on the JVM. Clojure is dynamically typed, and if you’re a Ruby or Python programmer, you’ll feel right at home once you get used to the unfamiliar syntax. Clojure is not a pure functional language, but in this chapter we’ll be concentrating on its purely functional subset. I’ll introduce the bits of Clojure that we’ll be using along the way, but if you want to learn more about the language I recommend Stuart Halloway and Aaron Bedra’s Programming Clojure [HB12].

In day 1 we’ll look at the basics of functional programming and see how it’s trivial to parallelize a functional algorithm. In day 2 we’ll dig deeper into Clojure’s reducers framework and see how this parallelization works under the hood. Finally, in day 3, we’ll switch our focus from parallelism to concurrency and create a concurrent functional web service with futures and promises.

Day 1: Programming Without Mutable State

When programmers first encounter functional programming, their reaction is often one of disbelief—that it can’t be possible to write nontrivial programs without modifying variables. We’ll see that it is not only possible but very often simpler and easier than creating normal imperative code.

The Perils of Mutable State

Today we’re going to concentrate on parallelism. We’ll construct a simple functional program and then show how, because it’s functional, it’s almost trivially easy to parallelize.

But first let’s look at a couple of examples in Java that show why it’s so helpful to avoid mutable state.

Hidden Mutable State

Here’s a class that doesn’t have any mutable state and should therefore be perfectly thread-safe:


class DateParser {
  private final DateFormat format = new SimpleDateFormat("yyyy-MM-dd");

  public Date parse(String s) throws ParseException {
    return format.parse(s);
  }
}

When I run a small example program that uses this class from multiple threads (you can see the source in the code that accompanies the book), I get the following:

Caught: java.lang.NumberFormatException: For input string: ".12012E4.12012E4"

Expected: Sun Jan 01 00:00:00 GMT 2012, got: Wed Apr 15 00:00:00 BST 2015

The next time I run it, I get this:

Caught: java.lang.ArrayIndexOutOfBoundsException: -1

And the next time, I get this:

Caught: java.lang.NumberFormatException: multiple points

Caught: java.lang.NumberFormatException: multiple points

Clearly the code isn’t thread-safe at all, but why? It only has a single member variable, and that’s immutable because it’s final.

The reason is that SimpleDateFormat has mutable state buried deep within. You can argue that this is a bug,[10] but for our purposes it doesn’t matter. The problem is that languages like Java make it both easy to write code with hidden mutable state like this and virtually impossible to tell when it happens—there’s no way to tell from its API that SimpleDateFormat isn’t thread-safe.
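If you do need a thread-safe date parser in Java, one common workaround (a sketch, not the book’s code) is to give each thread its own formatter via ThreadLocal, so the hidden mutable state inside SimpleDateFormat is never shared:

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class ThreadSafeDateParser {
    // Each thread lazily gets its own SimpleDateFormat instance, so the
    // mutable state buried inside the formatter is never shared.
    private static final ThreadLocal<DateFormat> format =
        ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

    public static Date parse(String s) throws ParseException {
        return format.get().parse(s);
    }

    public static void main(String[] args) throws Exception {
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                try {
                    ThreadSafeDateParser.parse("2012-01-01");
                } catch (ParseException e) {
                    throw new RuntimeException(e);
                }
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println("parsed concurrently without errors");
    }
}
```

Another option is simply to create a new SimpleDateFormat per call; ThreadLocal just avoids repeating that allocation.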

Hidden mutable state isn’t the only thing you need to be careful about, as we’ll see next.

Escapologist Mutable State

Imagine that you’re creating a web service that manages a tournament. Among other things, it’s going to need to manage a list of players, which you might be tempted to implement along these lines:

public class Tournament {
  private List<Player> players = new LinkedList<Player>();

  public synchronized void addPlayer(Player p) {
    players.add(p);
  }

  public synchronized Iterator<Player> getPlayerIterator() {
    return players.iterator();
  }
}

At first glance, this looks like it should be thread-safe—players is private and accessed only via the addPlayer and getPlayerIterator methods, both of which are synchronized. Unfortunately, it is not thread-safe because the iterator returned by getPlayerIterator still references the mutable state contained within players—if another thread calls addPlayer while the iterator is in use, we’ll see a ConcurrentModificationException or worse. The state has escaped from the protection provided by Tournament.
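One way to keep the state from escaping (a sketch, with Player replaced by String to keep it self-contained) is to hand out an iterator over a snapshot rather than over the live list:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class SafeTournament {
    private final List<String> players = new LinkedList<>();

    public synchronized void addPlayer(String p) {
        players.add(p);
    }

    // Iterate over a snapshot: the internal list never escapes, so a
    // concurrent addPlayer cannot invalidate an iterator already handed out.
    public synchronized Iterator<String> getPlayerIterator() {
        return new ArrayList<>(players).iterator();
    }

    public static void main(String[] args) {
        SafeTournament t = new SafeTournament();
        t.addPlayer("alice");
        Iterator<String> it = t.getPlayerIterator();
        t.addPlayer("bob");            // would break a live iterator...
        System.out.println(it.next()); // ...but the snapshot is unaffected
    }
}
```

The snapshot costs a copy on every call; the functional approach we’re about to see avoids the problem entirely, because there is no mutable state to escape in the first place.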

Hidden and escaped state are just two of the dangers of mutable state in concurrent programs—there are plenty of others. These dangers would disappear if we could find a way to avoid mutable state entirely, which is exactly what functional programming enables us to do.

A Whirlwind Tour of Clojure

It takes only a few minutes to get the hang of Clojure’s Lisp syntax.

The easiest way to experiment with Clojure is through its REPL (read-evaluate-print loop), which you can start with lein repl (lein is the standard Clojure build tool). This allows you to type code and have it evaluated immediately without having to create source files and compile them, which can be amazingly helpful when experimenting with unfamiliar code. When the REPL starts, you should see the following prompt:
user=>


Any Clojure code you type at this prompt will be evaluated immediately.

Clojure code is almost entirely constructed from parenthesized lists called s-expressions. A function call that in most languages would be written max(3, 5) is written like this:

user=> (max 3 5)
5


The same is true of mathematical operators. Here’s 1 + 2 * 3, for example:

user=> (+ 1 (* 2 3))
7


Defining a constant is achieved with def:

user=> (def meaning-of-life 42)
#'user/meaning-of-life


user=> meaning-of-life
42


Even control structures are s-expressions:

user=> (if (< meaning-of-life 0) "negative" "non-negative")
"non-negative"


Although almost everything in Clojure is an s-expression, there are a few exceptions. Vector (array) literals are surrounded by square brackets:

user=> (def droids ["Huey" "Dewey" "Louie"])
#'user/droids


user=> (count droids)
3


user=> (droids 0)
"Huey"


user=> (droids 2)
"Louie"


And map literals are surrounded by curly brackets:

user=> (def me {:name "Paul" :age 45 :sex :male})
#'user/me


user=> (:age me)
45


Keys in maps are often keywords, which start with a colon and are very similar to symbols in Ruby or interned strings in Java.

Finally, a function is defined with defn, with arguments specified as a vector:

user=> (defn percentage [x p] (* x (/ p 100.0)))
#'user/percentage


user=> (percentage 200 10)
20.0


That concludes our whirlwind tour of Clojure. I’ll introduce other aspects of the language as we go.

Our First Functional Program

I’ve said that the most interesting thing about functional programming is that it avoids mutable state, but we haven’t actually seen an example yet. Let’s rectify that now.

Imagine that you want to find the sum of a sequence of numbers. In an imperative language like Java, you would probably write something like this:

public int sum(int[] numbers) {
  int accumulator = 0;
  for (int n: numbers)
    accumulator += n;
  return accumulator;
}
That isn’t functional because accumulator is mutable: it changes after each iteration of the for loop. By contrast, this Clojure solution has no mutable variables:


(defn recursive-sum [numbers]
  (if (empty? numbers)
    0
    (+ (first numbers) (recursive-sum (rest numbers)))))

This is a recursive solution—recursive-sum calls itself (recurses). If numbers is empty, it simply returns zero. Otherwise, it returns the result of adding the first (head) element of numbers to the sum of the rest (tail) of the sequence.

Although our recursive solution works, we can do better. Here is a solution that’s both simpler and more efficient:


(defn reduce-sum [numbers]
  (reduce (fn [acc x] (+ acc x)) 0 numbers))

This uses Clojure’s reduce function, which takes three arguments—a function, an initial value, and a collection.

In this instance, we’re passing it an anonymous function defined with fn that takes two arguments and returns their sum. It’s called once by reduce for each element in the collection—the first time, it’s passed the initial value (0 in this case) together with the first item in the collection; the second time, it’s passed the result of the first invocation together with the second item in the collection; and so on.
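The same left-fold shape exists in Java’s streams, which may make the calling convention clearer if you’re coming from Java. A minimal sketch:

```java
import java.util.stream.IntStream;

public class ReduceDemo {
    // Left fold: the lambda is called once per element -- first with the
    // initial value (0) and the first element, then with each previous
    // result and the next element, exactly as described for Clojure's reduce.
    static int sum(int[] numbers) {
        return IntStream.of(numbers).reduce(0, (acc, x) -> acc + x);
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[]{1, 2, 3, 4})); // prints 10
    }
}
```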

We’re not quite done yet—we can make this code better still by noticing that + is already a function that, when given two arguments, returns their sum. We can pass it directly to reduce without creating an anonymous function:


(defn sum [numbers]

(reduce + numbers))

So we’ve arrived at a solution that is both simpler and more concise than the imperative one. You’ll find that this is a common experience when converting imperative code to functional.

Joe asks:

What If We Pass an Empty Collection to reduce?

Our final version of sum doesn’t pass an initial value to reduce:

(reduce + numbers)

This might make you wonder what happens if we give it an empty collection. The answer is that it does the right thing and returns zero:

sum.core=> (sum [])
0


But how does reduce know that zero (and not, say, 1 or nil) is the right thing to return? This relies on an interesting feature of many of Clojure’s operators—they know what their identity values are. The + function, for example, can take any number of arguments, including zero:

user=> (+ 1 2)
3


user=> (+ 1 2 3 4)
10


user=> (+ 42)
42


user=> (+)
0


When called with no arguments, it returns the additive identity, 0.

Similarly, * knows that the multiplicative identity is 1:

user=> (*)
1


If we don’t pass an initial value to reduce, it uses the result of calling the function it’s given with no arguments.
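Java’s Stream.reduce draws the same distinction, which makes a useful contrast: with an explicit identity you always get a value back, but because Java cannot ask the operator for its identity, reducing an empty stream without one can only yield an empty Optional. A sketch:

```java
import java.util.OptionalInt;
import java.util.stream.IntStream;

public class IdentityDemo {
    public static void main(String[] args) {
        // With an explicit identity, an empty stream reduces to that identity.
        int sum = IntStream.empty().reduce(0, Integer::sum);
        System.out.println(sum); // prints 0

        // Without one, Java has no way to know the identity and must
        // return an empty OptionalInt instead.
        OptionalInt maybe = IntStream.empty().reduce(Integer::sum);
        System.out.println(maybe.isPresent()); // prints false
    }
}
```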

Incidentally, because + can take any number of arguments, this also means that we can implement sum with apply, which takes a function together with a vector and calls the function with the vector as arguments:


(defn apply-sum [numbers]

(apply + numbers))

But unlike the version that uses reduce, this can’t easily be parallelized.

Effortless Parallelism

So we’ve seen some functional code, but what about parallelism? What would we need to do to convert our sum function to operate in parallel? Very little, it turns out:


(ns sum.core
  (:require [clojure.core.reducers :as r]))

(defn parallel-sum [numbers]
  (r/fold + numbers))

The only difference is that we’re now using the fold function from the clojure.core.reducers package (which we alias to r to save typing) instead of using reduce.

Here’s a REPL session that shows what this buys us in terms of performance:

sum.core=> (def numbers (into [] (range 0 10000000)))
#'sum.core/numbers


sum.core=> (time (sum numbers))

"Elapsed time: 1099.154 msecs"


sum.core=> (time (sum numbers))

"Elapsed time: 125.349 msecs"


sum.core=> (time (parallel-sum numbers))

"Elapsed time: 236.609 msecs"


sum.core=> (time (parallel-sum numbers))

"Elapsed time: 49.835 msecs"


We start by creating a vector that contains all the integers between zero and ten million by inserting the result of (range 0 10000000) into an empty vector with into. Then we use the time macro, which prints the time taken by whatever code it’s given. As is often the case with code running on the JVM, we have to run it more than once to give the just-in-time optimizer a chance to kick in and get a representative time.

So, on my four-core Mac, fold takes us from 125 ms to 50 ms, a 2.5x speedup. We’ll see how fold achieves this tomorrow, but before then let’s look at a functional version of our Wikipedia word-count example.
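For comparison, the JVM exposes the same serial-versus-parallel choice through streams, where—much as parallel-sum differs from sum only by using fold instead of reduce—only one call changes. A sketch:

```java
import java.util.stream.IntStream;

public class ParallelSumDemo {
    static long sum(int n) {
        return IntStream.range(0, n).asLongStream().sum();
    }

    // The parallel version differs only in the .parallel() call; the
    // runtime splits the range across cores and combines partial sums.
    static long parallelSum(int n) {
        return IntStream.range(0, n).asLongStream().parallel().sum();
    }

    public static void main(String[] args) {
        System.out.println(sum(10_000_000));         // prints 49999995000000
        System.out.println(parallelSum(10_000_000)); // same result, computed in parallel
    }
}
```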

Counting Words Functionally

Today we’ll create a sequential implementation of word count—we’ll parallelize it tomorrow. We’re going to need three things:

· A function that, given a Wikipedia XML dump, returns a sequence of the pages contained within that dump

· A function that, given a page, returns a sequence of the words on that page

· A function that, given a sequence of words, returns a map containing the frequencies of those words

We’re not going to cover the first two of these in any detail—this is a book about concurrency, not string processing or XML (see the accompanying code if you’re interested in the details). We will look at how to count words, however, as that’s what we’ll be parallelizing.

Functional Maps

Because we want to return a map of word frequencies, we’ll need to understand a couple of Clojure’s map functions—get and assoc:

user=> (def counts {"apple" 2 "orange" 1})


user=> (get counts "apple" 0)


user=> (get counts "banana" 0)


user=> (assoc counts "banana" 1)

{"banana" 1, "orange" 1, "apple" 2}

user=> (assoc counts "apple" 3)

{"orange" 1, "apple" 3}

So get simply looks up a key in the map and either returns its value or returns a default if the key isn’t in the map. And assoc takes a map together with a key and value and returns a new map with the key mapped to the value.
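Java’s closest equivalents are getOrDefault and copy-then-put. Note that this is only a semantic analogy: Clojure’s maps are persistent and share structure between versions, whereas copying a HashMap really does duplicate it. A sketch:

```java
import java.util.HashMap;
import java.util.Map;

public class MapDemo {
    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("apple", 2);
        counts.put("orange", 1);

        // Like (get counts "banana" 0): look up a key with a default.
        System.out.println(counts.getOrDefault("banana", 0)); // prints 0

        // assoc returns a *new* map; the closest Java idiom is copy-then-put,
        // which leaves the original untouched.
        Map<String, Integer> updated = new HashMap<>(counts);
        updated.put("banana", 1);
        System.out.println(counts.containsKey("banana")); // prints false
        System.out.println(updated.get("banana"));        // prints 1
    }
}
```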


We now know enough to write a function that takes a sequence of words and returns a map in which each word is associated with the number of times it appears:


(defn word-frequencies [words]
  (reduce
    (fn [counts word] (assoc counts word (inc (get counts word 0))))
    {} words))

This time we’re passing an empty map {} as the initial value to reduce. Then, for each word in words, we associate the word with one more than its current count (which defaults to zero). Here’s an example of it in use:

user=> (word-frequencies ["one" "potato" "two" "potato" "three" "potato" "four"])

{"four" 1, "three" 1, "two" 1, "potato" 3, "one" 1}

It turns out that the Clojure standard library has beaten us to it—there’s a standard function called frequencies that takes any collection and returns a map of the frequencies of its members:

user=> (frequencies ["one" "potato" "two" "potato" "three" "potato" "four"])

{"one" 1, "potato" 3, "two" 1, "three" 1, "four" 1}

Now that we can count words, all that remains is to wire things up with the XML processing.

More Sequence Functions

To see how to do that, we need to introduce a little more machinery. First, here’s the map function:

user=> (map inc [0 1 2 3 4 5])

(1 2 3 4 5 6)

user=> (map (fn [x] (* 2 x)) [0 1 2 3 4 5])

(0 2 4 6 8 10)

Given a function and a sequence, map returns a new sequence that contains the result of applying the function to each element of the sequence in turn.

We can simplify the second version slightly by using partial, which takes a function together with one or more arguments and returns a partially applied function:

user=> (def multiply-by-2 (partial * 2))
#'user/multiply-by-2


user=> (multiply-by-2 3)
6


user=> (map (partial * 2) [0 1 2 3 4 5])

(0 2 4 6 8 10)

Finally, imagine that you have a function that returns a sequence, such as using a regular expression to break a string into a sequence of words:

user=> (defn get-words [text] (re-seq #"\w+" text))
#'user/get-words


user=> (get-words "one two three four")

("one" "two" "three" "four")

As you would expect, mapping this function over a sequence of strings will give you a sequence of sequences:

user=> (map get-words ["one two three" "four five six" "seven eight nine"])

(("one" "two" "three") ("four" "five" "six") ("seven" "eight" "nine"))

If you want a single sequence that consists of all the subsequences concatenated, you can use mapcat:

user=> (mapcat get-words ["one two three" "four five six" "seven eight nine"])

("one" "two" "three" "four" "five" "six" "seven" "eight" "nine")

We now have all the tools we need to create our word-counting function.

Putting It All Together

Here’s count-words-sequential. Given a sequence of pages, it returns a map of the frequencies of the words on those pages:


(defn count-words-sequential [pages]

(frequencies (mapcat get-words pages)))

It starts by converting the sequence of pages into a sequence of words with (mapcat get-words pages). This sequence of words is then passed to frequencies.

It’s worth comparing this to the imperative version in the code. Again, the functional solution turns out to be significantly simpler, clearer, and more concise than its imperative equivalent.

It’s Good to Be Lazy

Something might be bothering you—a Wikipedia dump runs to around 40 GiB. If count-words-sequential starts by collating every word into a single huge sequence, surely we’re going to end up running out of memory.

We don’t, and the reason for that is that sequences in Clojure are lazy—elements of a lazy sequence are generated only when they’re needed. Let’s see what this means in practice.

Clojure’s range function produces a sequence of numbers:

user=> (range 0 10)

(0 1 2 3 4 5 6 7 8 9)

In the preceding code, the REPL realizes (fully evaluates) the sequence and then prints it.

There’s nothing to stop you from realizing really big ranges, but doing so can turn your computer into an expensive room heater. Try the following, for example, and you’ll have to wait a long time before seeing a result (assuming that you don’t run out of memory first):

user=> (range 0 100000000)

Try this, on the other hand, and you’ll get the answer immediately:

user=> (take 10 (range 0 100000000))

(0 1 2 3 4 5 6 7 8 9)

Because take is only interested in the first ten elements of its sequence argument, range only needs to generate the first ten elements. This works across any level of function-call nesting:

user=> (take 10 (map (partial * 2) (range 0 100000000)))

(0 2 4 6 8 10 12 14 16 18)

We can even program with infinite sequences. Clojure’s iterate function, for example, generates an infinite sequence by repeatedly applying a function to an initial value, then the returned value, and so on:

user=> (take 10 (iterate inc 0))

(0 1 2 3 4 5 6 7 8 9)

user=> (take 10 (iterate (partial + 2) 0))

(0 2 4 6 8 10 12 14 16 18)
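Java’s streams are lazy in much the same way: the analogous pipeline only ever generates the elements that limit demands, even from a conceptually infinite source. A sketch using Stream.iterate:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyDemo {
    public static void main(String[] args) {
        // Stream.iterate builds a conceptually infinite stream; limit(10)
        // means only the first ten elements are ever generated, just as
        // take does with a lazy Clojure sequence.
        List<Integer> evens = Stream.iterate(0, x -> x + 2)
            .limit(10)
            .collect(Collectors.toList());
        System.out.println(evens); // prints [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
    }
}
```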

One final aspect of lazy sequences is that not only do we not need to generate the elements at the end of a sequence until we need them (which might be never), but we can discard the elements at the front if we’ve finished with them (if we don’t “hold on to our head”). The following, for example, might take a while to complete, but you won’t run out of memory:

user=> (take-last 5 (range 0 100000000))

(99999995 99999996 99999997 99999998 99999999)

Because the sequence of pages returned by get-pages is lazy, count-words-sequential can handle a 40 GiB Wikipedia dump without a problem. And the bonus is that it’s very easy to parallelize, as we’ll see tomorrow.

Day 1 Wrap-Up

That brings us to the end of day 1. In day 2 we’ll parallelize our word count and look into fold in more detail.

What We Learned in Day 1

Concurrent programming in imperative languages is difficult because of the prevalence of shared mutable state. Functional programming makes concurrency easier and safer by eliminating shared mutable state. We saw how to do the following:

· Apply a function to every element of a sequence with map or mapcat

· Use laziness to handle large, or even infinite, sequences

· Reduce a sequence to a single (possibly complex) value with reduce

· Parallelize a reduce operation with fold

Day 1 Self-Study

Find
· The Clojure “cheat sheet,” which contains a quick reference to the most commonly used functions

· The documentation for lazy-seq, which enables you to create your own lazy sequences

Do
· Unlike many functional languages, Clojure does not provide tail-call elimination, so idiomatic Clojure makes very little use of recursion. Rewrite the recursive-sum function (see the code) to use Clojure’s loop and recur special forms instead.

· Rewrite reduce-sum (see the code) to use the #() reader macro instead of (fn ...).

Day 2: Functional Parallelism

Today we’ll continue our discussion of how functional programming helps with parallelism by looking at fold in more detail. But before we do that, we’ll look at parallelizing our Wikipedia word count.

One Page at a Time

In day 1 we saw that the map function creates a sequence by applying a function to each element of an input sequence in turn. But there’s no reason this has to happen serially—Clojure’s pmap function operates just like map, except that the function is applied in parallel. It’s semi-lazy, in that the parallel computation stays ahead of the consumption, but it won’t realize the entire result unless required.

We could, for example, convert our sequence of Wikipedia pages to a sequence of maps of word counts within those pages in parallel with this:

(pmap #(frequencies (get-words %)) pages)

In this case, we’re defining the function passed to pmap using the #(…) reader macro, which is a shorthand way to write an anonymous function. Arguments are specified with %1, %2, and so on, which can be further shortened to a single % if it takes only a single argument:

#(frequencies (get-words %))

The preceding code is equivalent to this:

(fn [page] (frequencies (get-words page)))

Here it is in action:

wordcount.core=> (def pages ["one potato two potato three potato four"

#_=> "five potato six potato seven potato more"])


wordcount.core=> (pmap #(frequencies (get-words %)) pages)

({"one" 1, "potato" 3, "two" 1, "three" 1, "four" 1}

{"five" 1, "potato" 3, "six" 1, "seven" 1, "more" 1})

We can then get the total word counts we’re looking for by reducing this sequence to a single map. Our reducing function will need to take two maps and merge them so that:

· the keys in the resulting map are the union of the keys in the two input maps;

· if a key exists in only one of the two input maps, that value is associated with the key in the result map;

· if a key exists in both input maps, the value associated with that key is the sum of the values from the two input maps.
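These merge rules map directly onto Java’s Map.merge, if you want to see them spelled out imperatively. A sketch:

```java
import java.util.HashMap;
import java.util.Map;

public class MergeCountsDemo {
    // Merge two count maps, summing values for keys present in both --
    // the same behavior as (merge-with + m1 m2).
    static Map<String, Integer> mergeCounts(Map<String, Integer> m1,
                                            Map<String, Integer> m2) {
        Map<String, Integer> result = new HashMap<>(m1);
        // merge: insert v if k is absent, otherwise combine with Integer::sum.
        m2.forEach((k, v) -> result.merge(k, v, Integer::sum));
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> a = new HashMap<>();
        a.put("x", 1); a.put("y", 2);
        Map<String, Integer> b = new HashMap<>();
        b.put("y", 1); b.put("z", 1);
        System.out.println(mergeCounts(a, b)); // x=1, y=3, z=1 (in some order)
    }
}
```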

The following diagram shows what we’re aiming for:


We could write this reducing function ourselves, but (as is so often the case) a function in the standard library does what we need. Here’s the documentation:

(merge-with f & maps)

Returns a map that consists of the rest of the maps conj-ed onto the first—if a key occurs in more than one map, the mapping(s) from the latter (left-to-right) will be combined with the mapping in the result by calling (f val-in-result val-in-latter).

Recall that partial returns a partially applied function, so (partial merge-with +) will give us a function that takes two maps and merges them using + to combine values if the same key appears in both:

user=> (def merge-counts (partial merge-with +))
#'user/merge-counts


user=> (merge-counts {:x 1 :y 2} {:y 1 :z 1})

{:z 1, :y 3, :x 1}

Putting this all together, here’s a parallel word count:


(defn count-words-parallel [pages]
  (reduce (partial merge-with +)
          (pmap #(frequencies (get-words %)) pages)))

Now that we’ve got a parallel word count, let’s see how well it performs.

Batching for Performance

On my MacBook Pro, the sequential version takes 140 seconds to count the words in the first 100,000 pages of Wikipedia. The parallel version takes 94 seconds, a 1.5x speedup. So we’re getting some performance benefit from parallelism, but not as much as we might like.

The reason is exactly the same as we saw last week in our threads and locks–based solution (see the code). We’re counting and merging on a page-by-page basis, which results in a large number of merges. We can reduce those merges by counting batches of pages instead of a single page at a time:


Figure 5. Batched word count

Here, for example, is an implementation of word-count that processes batches of 100 pages at a time:


(defn count-words [pages]
  (reduce (partial merge-with +)
          (pmap count-words-sequential (partition-all 100 pages))))

This uses partition-all, which batches (partitions) a sequence into multiple sequences:

user=> (partition-all 4 [1 2 3 4 5 6 7 8 9 10])

((1 2 3 4) (5 6 7 8) (9 10))

We then count the words within each batch with count-words-sequential and merge them as before. And sure enough, this version counts the words in the first 100,000 pages of Wikipedia in forty-four seconds, a 3.2x speedup.


Reducers
In day 1 we saw that switching from reduce to fold could deliver dramatic performance improvements. To understand how fold achieves this, we need to understand Clojure’s reducers library.

A reducer is a recipe that describes how to reduce a collection. The normal version of map takes a function and a (possibly lazy) sequence and returns another (possibly lazy) sequence:

user=> (map (partial * 2) [1 2 3 4])

(2 4 6 8)

Given the same arguments, the version from clojure.core.reducers, in contrast, returns a reducible:

user=> (require '[clojure.core.reducers :as r])
nil


user=> (r/map (partial * 2) [1 2 3 4])

#<reducers$folder$reify__1599 clojure.core.reducers$folder$reify__1599@151964cd>

A reducible isn’t a directly usable value—it’s just something that can subsequently be passed to reduce:

user=> (reduce conj [] (r/map (partial * 2) [1 2 3 4]))

[2 4 6 8]

The conj function we’re passing to reduce in the preceding code takes a collection as its first argument (initially an empty vector, []) and returns a new collection with its second argument added. The result, therefore, is a vector representing the result of the mapping.

The following is equivalent to the preceding code because into uses reduce internally:

user=> (into [] (r/map (partial * 2) [1 2 3 4]))

[2 4 6 8]

As well as map and mapcat, which we’ve already seen, there are reducer versions of most of the sequence-handling functions in clojure.core. And just like their clojure.core equivalents, they can be chained:

user=> (into [] (r/map (partial + 1) (r/filter even? [1 2 3 4])))

[3 5]

A reducer, instead of returning a result, returns a recipe for creating a result—a recipe that isn’t executed until it’s passed to either reduce or fold. This has two primary benefits:

· It’s more efficient than a chain of functions returning lazy sequences, because no intermediate sequences need to be created.

· It allows fold to parallelize the entire chain of operations on the underlying collection.

Reducers’ Internals

To understand how reducers work, we’re going to create our own slightly simplified, but still very effective, version of clojure.core.reducers. To do so, we first need to know about Clojure’s protocols. A protocol is very similar to an interface in Java—it’s a collection of methods that together define an abstraction. Clojure’s collections support reduce via the CollReduce protocol:

(defprotocol CollReduce
  (coll-reduce [coll f] [coll f init]))

CollReduce defines a single function called coll-reduce with multiple arities—it can take either two arguments (coll and f) or three (coll, f, and init). The first argument performs the same role as Java’s this reference, allowing polymorphic dispatch. Look at this Clojure code:

(coll-reduce coll f)

This Clojure code is equivalent to this Java:

coll.collReduce(f);
The reduce function simply calls through to coll-reduce, delegating the task of reducing to the collection itself. We can see this by implementing our own version of reduce:


(defn my-reduce
  ([f coll] (coll-reduce coll f))
  ([f init coll] (coll-reduce coll f init)))

This shows a feature of defn we’ve not seen before—it can be used to define functions that take varying numbers of arguments (in this case either two or three). In both cases, it simply forwards its arguments to coll-reduce. Let’s prove that it works:

reducers.core=> (my-reduce + [1 2 3 4])
10


reducers.core=> (my-reduce + 10 [1 2 3 4])
20


Next, let’s see how to implement our own version of map:


(defn make-reducer [reducible transformf]
  (reify
    CollReduce
    (coll-reduce [_ f1]
      (coll-reduce reducible (transformf f1) (f1)))
    (coll-reduce [_ f1 init]
      (coll-reduce reducible (transformf f1) init))))

(defn my-map [mapf reducible]
  (make-reducer reducible
    (fn [reducef]
      (fn [acc v]
        (reducef acc (mapf v))))))

We’re using a function called make-reducer that takes a reducible and a transform function and returns a reification of the CollReduce protocol. Reifying a protocol is similar to using new in Java to create an anonymous instance of an interface.

This instance of CollReduce calls the coll-reduce method on the reducible, using the transform function to transform its f1 argument.

Joe asks:

What Does an Underscore Mean?

It’s common Clojure style to use an underscore (“_”) as the name of an unused function parameter. We could have written this:

(coll-reduce [this f1]

(coll-reduce reducible (transformf f1) (f1)))

But using the underscore makes it clear that this is unused.

The transform function that’s passed to make-reducer is a function that takes a function as an argument and returns a transformed version of that function. Here’s my-map’s transformation function:

(fn [reducef]

(fn [acc v]

(reducef acc (mapf v))))

Recall that the reducing function is called once for each element of the collection, with an accumulator (acc) as its first argument and the value from the collection (v) as its second. So, given a reducing function reducef, we’re returning a function that calls reducef with its second argument modified by mapf (the function passed to my-map). Let’s prove that this works as expected:

reducers.core=> (into [] (my-map (partial * 2) [1 2 3 4]))

[2 4 6 8]

reducers.core=> (into [] (my-map (partial + 1) [1 2 3 4]))

[2 3 4 5]

As you would hope, we can also chain multiple mappings:

reducers.core=> (into [] (my-map (partial * 2) (my-map (partial + 1) [1 2 3 4])))

[4 6 8 10]

If you work through what this is doing, you’ll see that it performs a single reduce, using a single reducing function created by composing (partial * 2) with (partial + 1).
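That fusion is ordinary function composition. Here’s the same idea spelled out with Java’s Function.andThen (a sketch, not the reducers implementation):

```java
import java.util.function.Function;

public class ComposeDemo {
    static final Function<Integer, Integer> addOne = x -> x + 1;

    // Inner transform first, outer second: the same order a chained
    // my-map applies them inside a single pass over the collection.
    static final Function<Integer, Integer> fused = addOne.andThen(x -> x * 2);

    public static void main(String[] args) {
        for (int x : new int[]{1, 2, 3, 4}) {
            System.out.print(fused.apply(x) + " "); // prints 4 6 8 10
        }
        System.out.println();
    }
}
```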

We’ve seen how reducers support reduce. Next, we’ll look at how fold parallelizes reductions.

Divide and Conquer

Instead of reducing a collection serially, fold uses a binary chop. It starts by dividing the collection into two halves, then halving those halves, and so on until the collection has been divided into groups that are smaller than some limit (by default, 512). It then runs a set of sequential reduce operations over each group and combines the results pairwise until only a single result is left. This results in a binary tree like Figure 6, A fold Tree.


Figure 6. A fold Tree

The reduce and combine operations run in parallel because fold creates a matching tree of parallel tasks (using Java 7’s Fork/Join framework). The tasks at the leaves of the tree run the reduce operations. The tasks at the next level wait for the results of those reduce operations to be ready and, when they are, combine them; the process continues until only a single result is left at the root of the tree.

If fold is given a single function (+ in our example), that function is used for both the reduce and combine operations. As we’ll soon see, however, it sometimes makes sense to use one function for the reductions and another for the combinations.
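The binary chop that fold performs can be sketched directly with the Fork/Join framework it builds on. This is an illustration of the shape, not Clojure’s actual implementation; the 512 limit mirrors fold’s default group size:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class FoldSum extends RecursiveTask<Long> {
    // Below this size, fall back to a sequential reduce, mirroring
    // fold's default group size of 512.
    private static final int LIMIT = 512;
    private final int[] numbers;
    private final int lo, hi;

    FoldSum(int[] numbers, int lo, int hi) {
        this.numbers = numbers;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= LIMIT) {           // small enough: sequential reduce
            long acc = 0;
            for (int i = lo; i < hi; i++) acc += numbers[i];
            return acc;
        }
        int mid = (lo + hi) / 2;          // binary chop
        FoldSum left = new FoldSum(numbers, lo, mid);
        FoldSum right = new FoldSum(numbers, mid, hi);
        left.fork();                      // reduce the left half in parallel...
        long rightResult = right.compute();
        return left.join() + rightResult; // ...then combine the two results
    }

    static long sum(int[] numbers) {
        return new ForkJoinPool().invoke(new FoldSum(numbers, 0, numbers.length));
    }

    public static void main(String[] args) {
        int[] numbers = new int[10_000_000];
        for (int i = 0; i < numbers.length; i++) numbers[i] = i;
        System.out.println(sum(numbers)); // prints 49999995000000
    }
}
```

Here + plays both roles; as the text notes, fold also lets you supply separate reduce and combine functions.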

Supporting Fold

In addition to CollReduce, collections that can be folded also support CollFold:

(defprotocol CollFold

(coll-fold [coll n combinef reducef]))

Just as reduce delegates to coll-reduce, fold delegates to coll-fold:


(defn my-fold
  ([reducef coll]
   (my-fold reducef reducef coll))
  ([combinef reducef coll]
   (my-fold 512 combinef reducef coll))
  ([n combinef reducef coll]
   (coll-fold coll n combinef reducef)))

The two- and three-argument versions just call my-fold recursively, providing defaults for combinef and n if they’re not provided. The four-argument version calls the collection’s coll-fold implementation.

The only modification we need to make to our code to support parallel fold operations is to have make-reducer reify CollFold in addition to CollReduce:


(defn make-reducer [reducible transformf]
  (reify
    CollFold
    (coll-fold [_ n combinef reducef]
      (coll-fold reducible n combinef (transformf reducef)))

    CollReduce
    (coll-reduce [_ f1]
      (coll-reduce reducible (transformf f1) (f1)))
    (coll-reduce [_ f1 init]
      (coll-reduce reducible (transformf f1) init))))

The implementation is very similar to CollReduce—we transform the reducing function and pass the rest of the arguments through to coll-fold. Let’s prove that it works as expected:

reducers.core=> (def v (into [] (range 10000)))

#'reducers.core/v

reducers.core=> (my-fold + v)

49995000

reducers.core=> (my-fold + (my-map (partial * 2) v))

99990000

Next, we’ll see an example of passing a different reduce and combine function to fold.

Frequencies with Fold

Our old friend, the frequencies function, is an excellent example of a function that requires different reduce and combine operations when implemented with fold:


(defn parallel-frequencies [coll]
  (r/fold
    (partial merge-with +)
    (fn [counts x] (assoc counts x (inc (get counts x 0))))
    coll))


This should remind you strongly of the batched parallel implementation of word-count we saw earlier today (see the code)—each batch is reduced to a map that is then merged with (partial merge-with +).

We can’t try this out on our Wikipedia page count, because fold doesn’t work on a lazy sequence (there’s no way to perform a binary chop on a lazy sequence). But we can check that it works on, say, a large vector of random integers.

The repeatedly function creates an infinite lazy sequence by repeatedly calling the function it’s given as an argument. In this case, we’re using it to call rand-int, which returns a different random integer each time it’s called:

user=> (take 10 (repeatedly #(rand-int 10)))

(2 6 2 8 8 5 9 2 5 5)

We can use this to create a large vector of random integers as follows:

reducers.core=> (def numbers (into [] (take 10000000 (repeatedly #(rand-int 10)))))

#'reducers.core/numbers

And then we can count the occurrences of each number in that vector with frequencies and parallel-frequencies:

reducers.core=> (require ['reducers.parallel-frequencies :refer :all])

nil

reducers.core=> (time (frequencies numbers))

"Elapsed time: 1500.306 msecs"

{0 1000983, 1 999528, 2 1000515, 3 1000283, 4 997717, 5 1000101, 6 999993, …

reducers.core=> (time (parallel-frequencies numbers))

"Elapsed time: 436.691 msecs"

{0 1000983, 1 999528, 2 1000515, 3 1000283, 4 997717, 5 1000101, 6 999993, …

So the sequential version of frequencies takes around 1500 ms, and the parallel version a little over 400 ms, roughly a 3.4x speedup.

Day 2 Wrap-Up

That brings us to the end of day 2 and our discussion of parallelism in Clojure. Tomorrow we’ll move on to concurrency with futures and promises, and we’ll see how they enable the dataflow style of programming.

What We Learned in Day 2

Clojure allows operations on sequences to be easily and naturally parallelized.

· A map operation can be parallelized with pmap, yielding a semi-lazy parallel map.

· Such a parallel map can be batched for efficiency with partition-all.

· Alternatively, fold parallelizes reduce operations with an eager divide-and-conquer strategy.

· Instead of returning an intermediate sequence, the clojure.core.reducers versions of functions like map, mapcat, and filter return reducibles, which can be thought of as recipes for how to reduce a sequence.

Day 2 Self-Study


· The video of Rich Hickey presenting reducers at QCon 2012

· The documentation for pcalls and pvalues—how do they differ from pmap? Is it possible to implement them in terms of pmap?


· Create my-flatten and my-mapcat along the lines of my-map (see the code). Note that these will both be more complex than my-map because they will need to expand a single input sequence element into one or more elements of the resulting sequence. If you get stuck, see the implementation in the code that accompanies this book.

· Create my-filter. Again, this will be more complex than my-map because it will need to reduce the number of elements in the input sequence.

Day 3: Functional Concurrency

Over the previous two days, we’ve concentrated on parallelism. Today we’re going to change focus and look at concurrency. But before we do so, we’ll look deeper into why functional programming allows us to parallelize code so easily.

Same Structure, Different Evaluation Order

A common theme runs through everything we’ve seen over the last couple of days—functional programming allows us to play games with the order in which things are evaluated. If two calculations are independent, we can run them in any order we like, including in parallel.

The following code snippets all perform the same calculation, return the same result, and have almost identical structure, but they execute their component operations in very different orders:

(reduce + (map (partial * 2) (range 10000)))

Reduces a lazy sequence built on top of a lazy sequence—elements in each lazy sequence are generated on an as-needed basis.

(reduce + (doall (map (partial * 2) (range 10000))))

First generates the entirety of the mapped sequence (doall forces a lazy sequence to be fully realized) and then reduces it.

(reduce + (pmap (partial * 2) (range 10000)))

Reduces a semi-lazy sequence, which is generated in parallel.

(reduce + (r/map (partial * 2) (range 10000)))

Performs a single reduce over the reducible returned by r/map, using a single reducing function constructed by combining + with (partial * 2).

(r/fold + (r/map (partial * 2) (into [] (range 10000))))

Generates the entirety of the range first and then reduces that in parallel by creating a tree of reduce and combine operations.

In an imperative language like Java, the order in which things happen is tightly bound to the order in which statements appear in the source code. The compiler and runtime can move things around somewhat (something we have to be careful of when using threads and locks, as we saw in Mysterious Memory), but broadly speaking things happen in the same order as we write them down.

Functional languages have a much more declarative feel. Instead of writing a set of instructions for how to perform an operation, a functional program is more a statement of what the result should be. How the various calculations are ordered to achieve that result is much more fluid—this freedom to reorder calculations is what allows functional code to be parallelized so easily.

In the next section we’ll see why functional languages can play these kinds of games with evaluation order and why imperative languages cannot.

Referential Transparency

Pure functions are referentially transparent—anywhere an invocation of the function appears, we can replace it with its result without changing the behavior of the program. Look at this example:

(+ 1 (+ 2 3))

This is exactly equivalent to the following:

(+ 1 5)

Indeed, one way to think about what executing functional code means is to think of it as repeatedly replacing function invocations with their results until you reach the final result. For example, we could evaluate (+ (+ 1 2) (+ 3 4)) like this:

(+ (+ 1 2) (+ 3 4)) → (+ (+ 1 2) 7) → (+ 3 7) → 10

Or like this:

(+ (+ 1 2) (+ 3 4)) → (+ 3 (+ 3 4)) → (+ 3 7) → 10

Of course, the same is true for the + operator in Java, but in a functional program every function is referentially transparent. It is this fact that enables us to safely make the radical changes to evaluation order we’ve seen so far.

Joe asks:


But Isn’t Clojure Impure?

As we’ll see in the next chapter, Clojure is an impure functional language—it is possible to write functions with side effects in Clojure, and any such functions will not be referentially transparent.

This turns out to make little difference in practice because side effects are both very rare in idiomatic Clojure code and obvious when they do exist. There are a few simple rules about where side effects can safely appear, and as long as you follow those rules you’re unlikely to hit problems with evaluation order.


It’s interesting to think about how data flows between functions. Here is a graph of the data flows within (+ (+ 1 2) (+ 3 4)):


There are no dependencies between (+ 1 2) and (+ 3 4), so these two evaluations could theoretically happen in any order, including concurrently with each other. The final addition, however, can’t take place until the results of both the subcalculations are available.

Theoretically, the language runtime could start at the left side of this graph and “push” data toward the right side. Whenever the data associated with a function’s inputs becomes available, that function is executed. And each function could (theoretically, at least) execute concurrently. This style of execution is called dataflow programming. Clojure allows us to use this execution strategy through futures and promises.


A future takes a body of code and executes it in another thread. Its return value is a future object:

user=> (def sum (future (+ 1 2 3 4 5)))

#'user/sum

user=> sum

#<core$future_call$reify__6110@5d4ee7d0: 15>

We can retrieve the value of a future by dereferencing it with either deref or the shorthand @:

user=> (deref sum)

15

user=> @sum

15

Dereferencing a future will block until the value is available (or realized). We can use this to create exactly the dataflow graph we saw before:

user=> (let [a (future (+ 1 2))

#_=> b (future (+ 3 4))]

#_=> (+ @a @b))

10

In that code, we’re using let to bind a to (future (+ 1 2)) and b to (future (+ 3 4)). The evaluation of (+ 1 2) takes place in one thread and (+ 3 4) in another. Finally, the outer addition blocks until both the inner additions have completed.

Of course, it makes no sense to use futures for such tiny operations as adding two numbers—we’ll see a more realistic example soon. Before then, we’ll look at Clojure’s promises.


A promise is very similar to a future in that it’s a value that’s realized asynchronously and accessed with deref or @, which will block until it’s realized. The difference is that creating a promise does not cause any code to run—instead its value is set with deliver. Here’s a REPL session that illustrates this:

user=> (def meaning-of-life (promise))

#'user/meaning-of-life

user=> (future (println "The meaning of life is:" @meaning-of-life))

#<core$future_call$reify__6110@224e59d9: :pending>

user=> (deliver meaning-of-life 42)

#<core$promise$reify__6153@52c9f3c7: 42>

The meaning of life is: 42

We start by creating a promise called meaning-of-life and then use future to create a thread that prints its value (using future to create a thread like this is a common Clojure idiom). Finally we use deliver to set the value of our promise, which unblocks the thread we created earlier.

Now that we’ve seen how futures and promises work, let’s see them in a real application.

A Functional Web Service

We’re going to create a web service that accepts real-time transcript data (for example, the transcript of a television program) and translates it. The transcript is divided into “snippets,” where each snippet has a sequence number. Here, for example, is how the first stanza of Lewis Carroll’s poem Jabberwocky (from Through the Looking-Glass) might be divided into snippets:


Twas brillig, and the slithy toves


Did gyre and gimble in the wabe:


All mimsy were the borogoves,


And the mome raths outgrabe.

To deliver snippet 0 to our web service, we make a PUT request to /snippet/0 with the body, “Twas brillig, and the slithy toves.” Snippet 1 is delivered to /snippet/1, and so on.

This is a very simple API, but it’s not as simple to implement as it might at first appear. First, because it’s going to run within a concurrent web server, our code will need to be thread-safe. Second, networks being what they are, it will have to handle snippets being lost and retried, delivered more than once, and arriving out of order.

If we want to process snippets sequentially (independent of the order they arrive in), we’re going to have to keep track of which snippets we’ve already received and which we’ve processed. And, whenever we receive a new snippet, we’ll need to check to see which (if any) are now available to be handled. Implementing this sequentially isn’t easy—we’re going to show how concurrency can be used to create a simple solution.

Figure 7, The Structure of the Transcript Handler shows the structure of the solution we’re heading for.


Figure 7. The Structure of the Transcript Handler

On the left are the threads created by our web server to handle incoming requests. On the right is a thread that processes incoming snippets sequentially, waiting until the next snippet is available. In the next section we’ll talk about snippets, the data structure that mediates the communication between these threads.

Accepting Snippets

Here’s how we’re going to keep track of the snippets we’ve received:


(def snippets (repeatedly promise))

So snippets is an infinite lazy sequence of promises (insert your own software-engineer-versus-sales, infinite-sequence-of-undelivered-promises-related joke here). These promises are realized by accept-snippet when snippets become available:


(defn accept-snippet [n text]
  (deliver (nth snippets n) text))
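Here’s a quick REPL-style sketch of these mechanics in isolation from the web server (the definitions are repeated so the sketch is self-contained):

```clojure
;; snippets is an infinite lazy sequence of unrealized promises;
;; nth forces just enough of the sequence to reach promise n.
(def snippets (repeatedly promise))

(defn accept-snippet [n text]
  (deliver (nth snippets n) text))

;; Deliver snippet 1 before snippet 0 has arrived:
(accept-snippet 1 "Did gyre and gimble in the wabe:")

(realized? (nth snippets 0))   ; => false, snippet 0 not yet delivered
(realized? (nth snippets 1))   ; => true
@(nth snippets 1)              ; => "Did gyre and gimble in the wabe:"
```

Note that dereferencing promise 0 here would block, which is exactly the behavior the sequential-processing thread relies on.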

To handle snippets sequentially, we simply need to create a thread that dereferences each promise in turn. As an illustration, here’s one that simply prints out the value of each snippet as it becomes available:



(future
  (doseq [snippet (map deref snippets)]
    (println snippet)))

This uses doseq, which processes a sequence sequentially. In this case, the sequence it’s processing is a lazy sequence of dereferenced promises, each one of which is bound to snippet.

All that remains is to wire this all up into a web service. Here’s code that uses the Compojure library to do so:[11]


(defroutes app-routes
  (PUT "/snippet/:n" [n :as {:keys [body]}]
    (accept-snippet (edn/read-string n) (slurp body))
    (response "OK")))

(defn -main [& args]
  (run-jetty (site app-routes) {:port 3000}))

This defines a single PUT route that calls our accept-snippet function. We’re using an embedded Jetty web server[12]—like most web servers, Jetty is multithreaded, so our code needs to be thread-safe.

If we start the server (with lein run), we can use curl to prove to ourselves that this all works as we expect. Send snippet 0, for example:

$ curl -X put -d "Twas brillig, and the slithy toves" \
  -H "Content-Type: text/plain" localhost:3000/snippet/0
OK


And it’s immediately printed:

Twas brillig, and the slithy toves

But nothing will be printed if we send snippet 2 before snippet 1 has been sent:

$ curl -X put -d "All mimsy were the borogoves," \
  -H "Content-Type: text/plain" localhost:3000/snippet/2
OK


Send snippet 1, however:

$ curl -X put -d "Did gyre and gimble in the wabe:" \
  -H "Content-Type: text/plain" localhost:3000/snippet/1
OK


And both it and snippet 2 are printed:

Did gyre and gimble in the wabe:

All mimsy were the borogoves,

Delivering a snippet more than once causes no problems, because deliver is a no-op if called on a promise that’s already been realized. So the following results in no error and nothing being printed:

$ curl -X put -d "Did gyre and gimble in the wabe:" \
  -H "Content-Type: text/plain" localhost:3000/snippet/1
OK

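The same no-op behavior is easy to see at the REPL (a minimal sketch; p is a hypothetical name):

```clojure
;; deliver succeeds only on the first call; later calls are
;; no-ops that return nil and leave the value unchanged.
(def p (promise))

(deliver p "first")    ; realizes the promise
(deliver p "second")   ; no-op: returns nil

@p                     ; => "first"
```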

Now that we’ve demonstrated that we can handle snippets, let’s do something more interesting with them. Imagine that we have another web service that translates any sentences it’s given. We’re going to modify our transcript handler to use this web service to translate whatever it’s given.


Before we look at how to call our translation service, we first need to implement code to turn our sequence of snippets into a sequence of sentences. Sentence boundaries might appear anywhere within a snippet, so we might need to either split or join snippets to obtain sentences.

Let’s start by looking at how to split on sentence boundaries:


(defn sentence-split [text]
  (map trim (re-seq #"[^\.!\?:;]+[\.!\?:;]*" text)))

This passes a regular expression that matches sentences to re-seq, which returns a sequence of matches; trim then strips any extraneous spaces from each match:

server.core=> (sentence-split "This is a sentence. Is this?! A fragment")

("This is a sentence." "Is this?!" "A fragment")

Next, a little more regular-expression magic gives us a function that allows us to tell whether a string is a sentence:


(defn is-sentence? [text]
  (re-matches #"^.*[\.!\?:;]$" text))

server.core=> (is-sentence? "This is a sentence.")

"This is a sentence."

server.core=> (is-sentence? "A sentence doesn't end with a comma,")

nil

Finally, we can wire this all up to create strings->sentences, a function that takes a sequence of strings and returns a sequence of sentences:


(defn sentence-join [x y]
  (if (is-sentence? x) y (str x " " y)))

(defn strings->sentences [strings]
  (filter is-sentence?
          (reductions sentence-join
                      (mapcat sentence-split strings))))

This makes use of reductions. As its name suggests, this behaves like reduce; but instead of returning a single value, it returns a sequence of each of the intermediate values:

server.core=> (reduce + [1 2 3 4])

10

server.core=> (reductions + [1 2 3 4])

(1 3 6 10)

In our case, we’re using sentence-join as the reducing function. If its first argument is a sentence, this just returns its second argument. But if its first argument is not, it returns the two concatenated (with an intervening space):

server.core=> (sentence-join "A complete sentence." "Start of another")

"Start of another"

server.core=> (sentence-join "This is" "a sentence.")

"This is a sentence."

So with reductions, this gives us the following:

server.core=> (def fragments ["A" "sentence." "And another." "Last" "sentence."])

#'server.core/fragments

server.core=> (reductions sentence-join fragments)

("A" "A sentence." "And another." "Last" "Last sentence.")

Finally, we filter the result with is-sentence?:

server.core=> (filter is-sentence? (reductions sentence-join fragments))

("A sentence." "And another." "Last sentence.")

Now that we’ve got a sequence of sentences, we can pass them to our translation server.

Translating Sentences

A classic use case for futures is talking to another web service. A future allows computation, such as network access, to take place on another thread while the main thread continues. Here’s translate, a function that returns a future that will, when realized, contain the translation of its argument:


(def translator "http://localhost:3001/translate")

(defn translate [text]
  (future
    (:body (client/post translator {:body text}))))

This uses client/post from the clj-http library to make a POST request and retrieve the response.[13] We can use this to transform the result of the strings->sentences function we created earlier into a set of translations:


(def translations
  (delay
    (map translate (strings->sentences (map deref snippets)))))

This introduces the delay function, which creates a lazy value that isn’t realized until it’s dereferenced.
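Here’s a minimal sketch of delay in isolation (calls and lazy-value are hypothetical names; the atom exists purely so we can observe when the body runs):

```clojure
;; delay wraps a body that doesn't run until the first deref;
;; the result is cached, so the body runs at most once.
(def calls (atom 0))

(def lazy-value
  (delay
    (swap! calls inc)   ; side effect just to observe evaluation
    :computed))

@calls        ; => 0, the body hasn't run yet
@lazy-value   ; => :computed, first deref forces the body
@lazy-value   ; => :computed, cached
@calls        ; => 1, the body ran exactly once
```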

Putting It All Together

Here’s the complete source of our web service:


Line 1  (def snippets (repeatedly promise))
     2
     3  (def translator "http://localhost:3001/translate")
     4
     5  (defn translate [text]
     6    (future
     7      (:body (client/post translator {:body text}))))
     8
     9  (def translations
    10    (delay
    11      (map translate (strings->sentences (map deref snippets)))))
    12
    13  (defn accept-snippet [n text]
    14    (deliver (nth snippets n) text))
    15
    16  (defn get-translation [n]
    17    @(nth @translations n))
    18
    19  (defroutes app-routes
    20    (PUT "/snippet/:n" [n :as {:keys [body]}]
    21      (accept-snippet (edn/read-string n) (slurp body))
    22      (response "OK"))
    23    (GET "/translation/:n" [n]
    24      (response (get-translation (edn/read-string n)))))
    25
    26  (defn -main [& args]
    27    (run-jetty (wrap-charset (api app-routes)) {:port 3000}))

As well as the code to translate sentences, we’ve added a new GET endpoint to allow translations to be retrieved (line 23). This makes use of get-translation (line 16), which accesses the translations sequence we created earlier.

If you want to see this all in action, start the server, together with the translator server that’s included in the accompanying code. Then run the TranscriptTest application (also in the accompanying code) and you should see a sentence-by-sentence French translation of Jabberwocky:

$ lein run

Il brilgue, les tôves lubricilleux Se gyrent en vrillant dans le guave:

Enmîmés sont les gougebosqueux Et le mômerade horsgrave.

Garde-toi du Jaseroque, mon fils!

La gueule qui mord; la griffe qui prend!

Garde-toi de l'oiseau Jube, évite Le frumieux Band-à-prend!


So there we have it—a complete concurrent web service that uses a combination of laziness, futures, and promises. It has no mutable state and no locks, and it’s considerably simpler and easier to read than an equivalent service implemented in an imperative language is likely to be.

Joe asks:


Aren’t We Holding Onto Our Head?

Our web service makes use of two lazy sequences, snippets and translations. In both cases, we hold on to the head of these sequences (see It’s Good to Be Lazy), meaning that they will grow forever. Over time, they will consume more and more memory.

In the next chapter we’ll see how to use Clojure’s reference types to fix this problem and enhance this web service to handle more than one transcript.

Day 3 Wrap-Up

This brings us to the end of day 3 and our discussion of how functional programming facilitates concurrency and parallelism.

What We Learned in Day 3

Functions in a functional program are referentially transparent. This allows us to safely modify the order in which those functions are called without affecting the behavior of the program. In particular, this facilitates the dataflow style of programming (supported in Clojure with futures and promises), in which code executes when the data it depends on becomes available. We saw an example of how concurrent dataflow programming can simplify the implementation of a web service.

Day 3 Self-Study


· What is the difference between future and future-call? How would you implement one in terms of the other?

· How do you tell if a future has been realized without blocking? How do you cancel a future?


· Modify the transcript server so that a GET request to /translation/:n doesn’t block if the translation isn’t yet available, but returns an HTTP 409 status code instead.

· Implement the transcript server in an imperative language. Is your solution as simple as the functional Clojure implementation? How confident are you that it contains no race conditions?


There’s a common misconception about parallelism—many people believe that parallel programming necessarily raises the specter of nondeterminism. If things aren’t proceeding sequentially, the reasoning goes, and we can no longer rely on effects happening in a specific order, we’re always going to have to worry about race conditions.

Certainly, there are some concurrent programs that will always be nondeterministic. And this is unavoidable—some problems require solutions that are intrinsically dependent on the details of timing. But it’s not the case that all parallel programs are necessarily nondeterministic. The value of the sum of the numbers between 0 and 10,000 won’t change just because we add those numbers together in parallel instead of sequentially. The frequencies of the words in a particular Wikipedia dump are and always will be the same, no matter how many threads we use to count them.

Most of the potential race conditions in traditional threads and locks–based parallel programs are accidental, arising from the details of the solution rather than any intrinsic nondeterminism in the problem.

Because functional code is referentially transparent, we can modify the order in which it’s executed, safe in the knowledge that we will not change the final result by doing so. This includes evaluating mutually independent functions in parallel—as we’ve seen, this allows us to parallelize functional code almost trivially easily.

Joe asks:


Where Are the Monads and the Monoids?

Introductions to functional programming tend to involve descriptions of mathematical concepts like monads, monoids, and category theory. We’ve just had a whole chapter on functional programming, with no mention of any of these. What gives?

One of the biggest influences on the flavor of any programming language is its type system. Writing code in a statically typed language like Java or Scala feels very different from writing code in a dynamically typed language like Ruby or Python.

Static type systems place an up-front burden on the programmer to get the types right. The payoff is that doing so enables the compiler to make guarantees that certain types of errors will not occur at runtime and to improve efficiency by making optimizations guided by the type system.

A programmer using a dynamically typed language avoids this up-front burden but accepts the risks that some errors will happen at runtime and that compiled code may be less efficient.

The same distinction is present in the world of functional programming. A statically typed functional language like Haskell uses concepts like monads and monoids to allow its type system to accurately encode restrictions on where particular functions and values can be used and to keep track of side effects while remaining functional.

Although an understanding of these mathematical concepts is undoubtedly helpful when writing Clojure code, no static type system needs to be told about them. The downside is that this places an additional burden on the programmer to make sure that functions and values are used in appropriate contexts—the compiler won’t warn you if you fail to do so.


The primary benefit of functional programming is confidence, confidence that your program does what you think it does. Once you’ve got into thinking functionally (which can take a while, especially if you have years of experience with imperative programming), functional programs tend to be simpler, easier to reason about, and easier to test than their imperative equivalents.

Once you have a working functional solution, referential transparency allows you to parallelize it, or operate in a concurrent environment, with very little effort. Because functional code eliminates mutable state, the majority of the concurrency bugs that can show up in traditional threads and locks–based programs are impossible.


Many people expect that functional code will be less efficient than its imperative equivalent. Although there are performance implications for some types of problem, the penalty is likely to be less than you fear. And any small performance hit is likely to be more than worth it for the payoff of increased robustness and scalability.

Other Languages

Java 8 has recently added a number of features that make it easier to write code in a functional style, most notably lambda expressions and streams.[14][15] Streams support aggregate operations that can process streams in parallel in a manner very similar to Clojure’s reducers.

No description of functional programming would be complete without mentioning Haskell.[16] Haskell provides equivalents of everything we’ve seen in this chapter and more. For an excellent introduction to parallel and concurrent programming in Haskell, see Simon Marlow’s tutorial.[17]

Final Thoughts

There’s a great deal more to functional programming than we’ve seen in this chapter—above and beyond its excellent support for concurrency and parallelism. It seems inevitable that functional programming will play an increasingly important role in the future.

Having said that, mutable state is going to be with us for the foreseeable future. In the next chapter we’ll see how Clojure supports side effects without compromising its support for concurrency.