Shift - Functional Thinking (2014)

Functional Thinking (2014)

Chapter 2. Shift

Learning a new programming language is easy: you merely learn the new syntax for familiar concepts. For example, if you decide to learn JavaScript, your first stop will be a resource that explains how JavaScript’s if statement works. Typically, developers learn new languages by applying what they know about existing languages. But learning a new paradigm is difficult—you must learn to see different solutions to familiar problems.

Writing functional code doesn’t require a shift to a functional programming language such as Scala or Clojure but rather a shift in the way you approach problems.

A Common Example

Once garbage collection became mainstream, it simultaneously eliminated entire categories of hard-to-debug problems and allowed the runtime to manage a process that is complex and error-prone for developers. Functional programming aims to do the same thing for the algorithms you write, allowing you to work at a higher level of abstraction while freeing the runtime to perform sophisticated optimizations. Developers receive the same benefits of lower complexity and higher performance that garbage collection provides, but at a more intimate level, in the way you devise solutions.

Imperative Processing

Imperative programming describes a style of programming modeled as a sequence of commands (imperatives) that modify state. A traditional for loop is an excellent example of the imperative style of programming: establish an initial state and execute a series of commands for each iteration of the loop.

To illustrate the difference between imperative and functional programming, I’ll start with a common problem and its imperative solution. Let’s say that you are given a list of names, some of which consist of a single character, and you are asked to return a comma-delimited string with the single letter names removed, with each name capitalized. Java code to implement this algorithm appears in Example 2-1.

Example 2-1. Typical company process (in Java)

package com.nealford.functionalthinking.trans;

importjava.util.List;

publicclassTheCompanyProcess {

public String cleanNames(List<String> listOfNames) {

StringBuilder result = new StringBuilder();

for(int i = 0; i < listOfNames.size(); i++) {

if (listOfNames.get(i).length() > 1) {

result.append(capitalizeString(listOfNames.get(i))).append(",");

}

}

return result.substring(0, result.length() - 1).toString();

}

public String capitalizeString(String s) {

return s.substring(0, 1).toUpperCase() + s.substring(1, s.length());

}

}

Because you must process the entire list, the easiest way to attack the problem in Example 2-1 is within an imperative loop. For each name, I check to see if its length is greater than the disallowed single character, then append the capitalized name onto result, along with a trailing comma. The last name in the final string shouldn’t include the comma, so I strip it off the final return value.

Imperative programming encourages developers to perform operations within loops. In this case, I do three things: filter the list to eliminate single characters, transform the list to capitalize each name, then convert the list into a single string. For now, I’ll call these three operations Useful Things to do to a list. In an imperative language, I must use the same low-level mechanism (iteration over the list) for all three types of processing. Functional languages offer specific helpers for these operations.

Functional Processing

Functional programming describes programs as expressions and transformations, modeling mathematical formulas, and tries to avoid mutable state. Functional programming languages categorize problems differently than imperative languages. The logical categories listed earlier (filter,transform, and convert) are represented as functions that implement the low-level transformation but rely on the developer to customize the low-level machinery with a higher-order function, supplied as one of the parameters. Thus, I could conceptualize the problem as the pseudocode inExample 2-2.

Example 2-2. Pseudocode for the “company process”

listOfEmps

-> filter(x.length > 1)

-> transform(x.capitalize)

-> convert(x + "," + y)

Functional languages allow you to model this conceptual solution without worrying about the details.

Consider the company process example from Example 2-1 implemented in Scala, shown in Example 2-3.

Example 2-3. Processing functionally in Scala

val employees =List("neal", "s", "stu", "j", "rich", "bob", "aiden", "j", "ethan",

"liam", "mason", "noah", "lucas", "jacob", "jayden", "jack")

val result = employees

.filter(_.length() > 1)

.map(_.capitalize)

.reduce(_ + "," + _)

The use of Scala in Example 2-3 reads much like the pseudocode in Example 2-2, with necessary implementation details. Given the list of names, I first filter it, eliminating single characters. The output of that operation is then fed into the map function, which executes the supplied code block on each element of the collection, returning the transformed collection. Finally, the output collection from map flows to the reduce() function, which combines each element based on the rules supplied in the code block. In this case, to combine the first two elements, I concatenate them with a comma. In all three of these small functions, I don’t care what the parameters are named, so Scala allows me to skip the names and use an underscore instead. In the case of reduce(), I still pass two parameters, which is the expected signature even though I’m using the same generic indicator, the underscore.

I chose Scala as the first language to show this implementation because of its somewhat familiar syntax and the fact that Scala uses industry-consistent names for these concepts. In fact, Java 8 has these same features, and closely resembles the Scala version, as shown in Example 2-4.

Example 2-4. Java 8 version of the Company Process

public String cleanNames(List<String> names) {

if (names == null) return "";

return names

.stream()

.filter(name -> name.length() > 1)

.map(name -> capitalize(name))

.collect(Collectors.joining(","));

}

private String capitalize(String e) {

return e.substring(0, 1).toUpperCase() + e.substring(1, e.length());

}

In Example 2-4, I use the collect() method rather than reduce() because it is more efficient with the Java String class; collect() is a special case for reduce() in Java 8. Otherwise, it reads remarkably similarly to the Scala code in Example 2-3.

If I was concerned that some of the items in the list might be null, I can easily add another criterium to the stream:

return names

.stream()

.filter(name -> name != null)

.filter(name -> name.length() > 1)

.map(name -> capitalize(name))

.collect(Collectors.joining(","));

The Java runtime is intelligent enough to combine the null check and length filter into a single operation, allowing you to express the idea succinctly yet still have performant code.

Groovy has these features but names them more consistently than scripting languages such as Ruby. The Groovy version of the “company process” from Example 2-1 appears in Example 2-5.

Example 2-5. Processing in Groovy

publicstatic String cleanUpNames(listOfNames) {

listOfNames

.findAll { it.length() > 1 }

.collect { it.capitalize() }

.join ','

}

While Example 2-5 is structurally similar to the Scala code in Example 2-3, the method names and substitution identifier differ. Groovy’s findAll on a collection applies the supplied code block and keeps the ones that yield true. Like Scala, Groovy allows developers to shortcut writing code blocks with single parameters; the Groovy substitution mechanism is the implicit it keyword, which represents the single parameter. The collect method, Groovy’s version of map, executes the supplied code block on each element of the collection. Groovy supplies a function (join()) that accepts a collection of strings and concatenates them into a single string, using the supplied delimiter, which is precisely what I need.

Clojure is a functional language and uses the more traditional function names, as shown in Example 2-6.

Example 2-6. Processing in Clojure

(defn process [list-of-emps]

(reduce str (interpose ","

(map s/capitalize (filter #(< 1 (count %)) list-of-emps)))))

Unless you are accustomed to reading Clojure, the structure of the code in Example 2-6 might be unclear. Lisps such as Clojure work “inside out,” so the place to start is the final parameter value list-of-emps. Clojure’s (filter a b) function accepts two parameters: a function to use for filtering (in this case, an anonymous function) and the collection to filter. You can write a formal function definition for the first parameter such as (fn [x] (< 1 (count x))), but Clojure allows you to write anonymous functions more tersely as #(< 1 (count %)). As in the previous examples, the outcome of the filtering operation is a smaller collection.

The (map a b) function accepts the transformation function as the first parameter and the collection second, which in this case is the return value of the (filter ) operation. map’s first parameter can be a custom function, but any function that accepts a single parameter will work, and the built-in capitalize function matches the requirement. Finally, the result of the (map ) operation becomes the collection parameter for (reduce ), whose first parameter is the combination (str ) function applied to the return of the (interpose ) function, which inserts its first parameter between each element of the collection (except the last).

Even experienced developers suffer when functionality becomes too nested like this. Fortunately, Clojure includes macros that allow you to “unwind” structures like this into more readable order. Consider Example 2-7, which is functionally identical to the Example 2-6 version.

Example 2-7. Improved readability via the thread-last macro

(defn process2 [list-of-emps]

(->> list-of-emps

(filter #(< 1 (count %)))

(map s/capitalize)

(interpose ",")

(reduce str)))

The thread-last (->>) macro takes the very common operation of applying a variety of transformations on collections and reverses the typical Lisp ordering, restoring a more natural left-to-right reading. In Example 2-7, the collection (list-of-emps) appears first. Each subsequent form in the block is applied to the previous one. One of the strengths of Lisp lies in its syntactic flexibility: any time something becomes difficult to read, bend the syntax back toward readability.

All these languages include key concepts from functional programming. Part of transitioning to functional thinking is learning where to apply these higher-level abstractions and stop going immediately for detailed implementations.

What are the benefits of thinking at a higher level of abstraction? First, it encourages you to categorize problems differently, seeing commonalities. Second, it allows the runtime to be more intelligent about optimizations. In some cases, reordering the work stream makes it more efficient (for example, processing fewer items) if it doesn’t change the ultimate outcome. Third, it allows solutions that aren’t possible when the developer is elbow deep in the details of the engine. For example, consider the amount of work required to make the Java code in Example 2-1 run across multiple threads. Because you control the low-level details of iteration, you must weave the thread code into yours. In the Scala version, I can make the code parallel by adding par to the stream, as shown in Example 2-8.

Example 2-8. Scala processing in parallel

val parallelResult = employees

.par

.filter(_.length() > 1)

.map(_.capitalize)

.reduce(_ + "," + _)

I can make an almost identical change to the Java 8 version to achieve the same effect, as shown in Example 2-9.

Example 2-9. Java 8 parallel processing

public String cleanNamesP(List<String> names) {

if (names == null) return "";

return names

.parallelStream()

.filter(n -> n.length() > 1)

.map(e -> capitalize(e))

.collect(Collectors.joining(","));

}

Clojure has a similar drop-in replacement for common collection transformations that makes them seamlessly parallel. Working at a higher level of abstraction allows the runtime to optimize low-level details. Writing an industrial-strength virtual machine with garbage collection is an extraordinarily complex task, and developers gladly cede those responsibilities. JVM engineers improved developers’ lives by mostly encapsulating garbage collection and eliminating it from daily consideration.

The same dual benefit exists for functional operations such as map, reduce, and filter. An excellent example is the Reducers Library in Clojure. By making a library extension to the Clojure language, its creator Rich Hickey provided new versions of vector and map (as well as a newfold function that works with existing vectors and maps) that use the underlying Java Fork/Join library to provide parallel processing of collections. One of Clojure’s selling points is that it removes concurrency as a developer concern just as Java removed garbage collection. The fact that Clojure developers use map instead of iteration means that they automatically receive upgraded abilities.

NOTE

Focus on results over steps.

Stop thinking about the low-level details of how iteration, transformation, and reduction work, and start noticing the prevalence of problems in those shapes.

As another example of transforming an imperative solution to a functional one, consider the problem of perfect numbers and number classification.

Case Study: Number Classification

The Greek mathematican Nicomachus devised a classification scheme for natural numbers, identifying each as belonging uniquely to the categories of abundant, perfect, or deficient. A perfect number equals the sum of its positive divisors—the pairs of numbers whose product yields the target number, excluding the number itself. For example, 6 is a perfect number because its divisors are 1, 2, 3, and 6 = 1 + 2 + 3; similarly, 28 is perfect because 28 = 1 + 2 + 4 + 7 + 14. The definition of perfect numbers delineates the classification scheme shown in Table 2-1.

Table 2-1. Integer classification scheme

Perfect

Sum of factors = number

Abundant

Sum of factors > number

Deficient

Sum of factors < number

One additional mathematics concept assists in the implementation: the aliquot sum, which is defined as the sum of the factors of a number not including the number itself, which is nominally one of the factors. Using an aliquot sum rather than the proper sum of the factors makes the comparisons for perfection easier: aliquotSum == number rather than sum - number == number.

Imperative Number Classification

For the implementation, I know that it’s likely that several classifications will take place per number. Given these requirements, a Java solution appears in Example 2-10.

Example 2-10. Number classifier in Java

importjava.util.HashMap;

importjava.util.HashSet;

importjava.util.Map;

importjava.util.Set;

publicclassImpNumberClassifierSimple {

privateint _number; 1

private Map<Integer, Integer> _cache; 2

public ImpNumberClassifierSimple(int targetNumber) {

_number = targetNumber;

_cache = new HashMap<>();

}

publicboolean isFactor(int potential) {

return _number % potential == 0;

}

public Set<Integer> getFactors() {

Set<Integer> factors = new HashSet<>();

factors.add(1);

factors.add(_number);

for (int i = 2; i < _number; i++)

if (isFactor(i))

factors.add(i);

return factors;

}

publicint aliquotSum() { 3

if (_cache.get(_number) == null) {

int sum = 0;

for (int i : getFactors())

sum += i;

_cache.put(_number, sum - _number);

}

return _cache.get(_number);

}

publicboolean isPerfect() {

return aliquotSum() == _number;

}

publicboolean isAbundant() {

return aliquotSum() > _number;

}

publicboolean isDeficient() {

return aliquotSum() < _number;

}

}

1

Internal state to hold the classification target number

2

Internal cache to prevent recalcuating the sum unnecessarily

3

Calculation of the aliquotSum, the sum of factors minus the number itself

In the ImpNumberClassifierSimple class in Example 2-10, two elements of internal state exist. The number field allows me to avoid passing it as a parameter to many functions. The cache holds a Map, used to cache the sums for each number, yielding faster results (lookup versus calculation) on subsequent invocations for a particular number. Internal state is common and encouraged in the object-oriented world because OOP languages utilitize encapsulation as one of their benefits. Separating state often makes engineering practices such as unit testing easier, allowing easy injection of values.

The code in Example 2-10 is finely factored, with many small methods. This is a side effect of test-driven development, but it also allows me to show each part of the algorithm. I’ll gradually switch out some of the parts for more functional versions.

Slightly More Functional Number Classification

One of my goals when I wrote the code in Example 2-10 was testability. What if I additionally wanted to minimize shared state? To do so, I can eliminate the member variables and pass the needed values as parameters. Consider the updated version in Example 2-11.

Example 2-11. Slightly more functional number classifier

importjava.util.Collection;

importjava.util.Collections;

importjava.util.HashSet;

importjava.util.Set;

publicclassNumberClassifier {

publicstaticboolean isFactor(finalint candidate, finalint number) { 1

return number % candidate == 0;

}

publicstatic Set<Integer> factors(finalint number) { 2

Set<Integer> factors = new HashSet<>();

factors.add(1);

factors.add(number);

for (int i = 2; i < number; i++)

if (isFactor(i, number))

factors.add(i);

return factors;

}

publicstaticint aliquotSum(final Collection<Integer> factors) {

int sum = 0;

int targetNumber = Collections.max(factors);

for (int n : factors) { 3

sum += n;

}

return sum - targetNumber;

}

publicstaticboolean isPerfect(finalint number) {

return aliquotSum(factors(number)) == number;

}

4

publicstaticboolean isAbundant(finalint number) {

return aliquotSum(factors(number)) > number;

}

publicstaticboolean isDeficient(finalint number) {

return aliquotSum(factors(number)) < number;

}

}

1

All methods must accept number as a parameter—no internal state exists to hold it.

2

All methods are public static because they are pure functions, thus generically useful outside the number-classification realm.

3

Note the use of the most general reasonable parameter, aiding reuse at the function level.

4

The code is currently inefficient for repeating classifications; no caching.

In the slightly more functional NumberClassifier in Example 2-11, all the methods are really self-contained, pure functions (functions that have no side effects), with public, static scope. Because there is no internal state in this class, no reason exists to “hide” any of the methods. In fact, the factors method is potentially useful in many other applications, such as searching for prime numbers.

Typically, the finest-grained element of reuse in object-oriented systems is the class, and developers forget that reuse comes in smaller packages. For example, the sum method in Example 2-11 accepts a Collection<Integer> rather than a specific type of list. That interface is general for all collections of numbers, making it more generally reusable at the function level.

I also did not implement the caching scheme for sum in this solution. Keeping a cache implies persistent state, and I don’t really have any place to hold that state in this version. The code in Example 2-11 is also less efficient compared to the same functionality in Example 2-10. Because no internal state exists to hold the sum, it must be recalculated each time. In a later version in Chapter 4, I preserve statefulness and regain the cache via memoization, but for now I’ve lost it.

Java 8 Number Classifier

The most dramatic addition to Java 8 are lambda blocks, its version of higher-order functions. With this simple addition, Java developers have access to some of the same high-level abstractions that traditional functional languages use.

Consider the Java 8 version of the number classfier, shown in Example 2-12.

Example 2-12. Number classifier in Java 8

importjava.util.List;

importjava.util.stream.Collectors;

importjava.util.stream.IntStream;

importjava.util.stream.Stream;

importstatic java.lang.Math.sqrt;

importstatic java.util.stream.Collectors.toList;

importstatic java.util.stream.IntStream.range;

publicclassNumberClassifier {

publicstatic IntStream factorsOf(int number) {

return range(1, number + 1)

.filter(potential -> number % potential == 0);

}

publicstaticint aliquotSum(int number) {

return factorsOf(number).sum() - number;

}

publicstaticboolean isPerfect(int number) {

return aliquotSum(number) == number;

}

publicstaticboolean isAbundant(int number) {

return aliquotSum(number)> number;

}

publicstaticboolean isDeficient(int number) {

return aliquotSum(number) < number;

}

}

The code in Example 2-12 is dramatically shorter and simpler than in the original imperative version (Example 2-10) or the slightly more functional version (Example 2-11). In Example 2-12, the factorsOf() method returns an IntStream, allowing me to chain other operations onto it, including a terminating one that causes the stream to generate values. In other words, the return from factorsOf() isn’t a list of integers but rather a stream that hasn’t generated any values yet. Writing the aliquotSum() method is trivial: it is the sum of the list of factors minus the number itself. In Example 2-12, I wasn’t required to write the sum() method—in Java 8, it is one of the stream terminators that generates values.

In physics, energy is differentiated into potential, energy stored and ready to use, and kinetic, energy expenditure. For collections in languages such as Java before version 8, all collections acted as kinetic energy: the collection resolved values immediately, keeping no intermediate state. Streams in functional languages work more like potential energy, which is stored for later use. The stream holds the origin of the data (in Example 2-12, the origin comes from the range() method) and whatever criteria have been attached to the stream, such as filtering operations. The stream doesn’t convert from potential to kinetic until the developer “asks” for values, using a terminating operation such as forEach() or sum(). The stream is passable as a parameter and can have additional criteria added later to its potential until it becomes kinetic. This is an example oflazy evaluation, which I cover in detail in Chapter 4.

This style of coding was possible, with difficulty, in previous versions of Java, using some useful frameworks.

Functional Java Number Classifier

While all modern languages now include higher-order functions, many organizations stay on older versions of runtimes such as Java for many years for nontechnical reasons. Functional Java is an open source framework whose mission includes adding as many functional idioms to Java post version 1.5 with as little intrusiveness as possible. For example, because the Java 1.5-era JDK don’t include higher-order functions, Functional Java mimics their use via generics and anonymous inner classes. The number classifier takes on a different look when implemented using Functional Java’s idioms, as shown in Example 2-13.

Example 2-13. Number classifier using the Functional Java framework

importfj.F;

importfj.data.List;

importstatic fj.data.List.range;

publicclassNumberClassifier {

public List<Integer> factorsOf(finalint number) {

return range(1, number + 1) 1

.filter(new F<Integer, Boolean>() {

public Boolean f(final Integer i) {

return number % i == 0;

}

}); 2

}

publicint aliquotSum(List<Integer> factors) { 3

return factors.foldLeft(fj.function.Integers.add, 0) - factors.last();

}

publicboolean isPerfect(int number) {

return aliquotSum(factorsOf(number)) == number;

}

publicboolean isAbundant(int number) {

return aliquotSum(factorsOf(number)) > number;

}

publicboolean isDeficient(int number) {

return aliquotSum(factorsOf(number)) < number;

}

}

1

Ranges in Functional Java are noninclusive.

2

Filter rather than iterate.

3

Fold rather than iterate.

The primary differences between Examples 2-13 and 2-11 lie in two methods: aliquotSum() and factorsOf(). The aliquotSum() method takes advantage of a method on the List class in Functional Java, the foldLeft() method. In this case, a “fold left” means:

1. Take an initial value (0 in this case) and combine it via an operation on the first element of the list.

2. Take the result and apply the same operation to the next element.

3. Keep doing this until the list is exhausted.

Notice that this is exactly what you do when you sum a list of numbers: start with zero, add the first element, take that result and add it to the second, and continue until the list is consumed. Functional Java supplies the higher-order function (in this example, the Integers.add function) and takes care of applying it for you. Of course, Java didn’t have higher-order functions until Java 8. Functional Java uses anonymous inner classes to mimic the style if not the full capabilities of higher-order functions.

The other intriguing method in Example 2-13 is factorsOf(), which illustrates my “results over steps” mantra. What is the essence of the problem of discovering factors of a number? Stated another way, given a list of all possible numbers up to a target number, how do I determine which ones are factors of the number? This suggests a filtering operation—I can filter the entire list of numbers, eliminating those that don’t meet my criteria. The method basically reads like this description: take the range of numbers from 1 to my number (the range is noninclusive, hence the incrementation by 1); filter the list based on the code in the f() method, which is Functional Java’s way of allowing you to create a class with specific data types and return values.

In Example 2-13, I used the foldLeft() method, which collapses elements leftward, toward the first element. For addition, which is commutative, the direction doesn’t matter. If, on the other hand, I need to use an operation in which order matters, there is also a foldRight() variant.

NOTE

Higher-order abstractions eliminate friction.

You might think that the difference between the Functional Java version (Example 2-13) and the Java 8 version (Example 2-12) is merely syntactic sugar. (It is actually that plus more.) Yet syntactic convenience is important because syntax is the way you express ideas in a language.

I once had a memorable discussion with Martin Fowler in a taxicab in Barcelona, and we were talking about the waning of Smalltalk versus the waxing of Java. Fowler did extensive work in both and says that he initially viewed the switch from Smalltalk to Java as a syntactic inconvenience, but eventually as an impediment to the kinds of thinking afforded in the previous world. Placing syntactic hurdles around encouraged abstractions adds needless friction to the thought process.

NOTE

Don’t add needless friction.

Common Building Blocks

The categories of Useful Things I mentioned earlier appear in all the functional versions of the number classifier, with differing names. These Useful Things are ubiquitous in functional languages and frameworks.

Filter

A common operation on lists is filtering: creating a smaller list by filtering items in a list based on some user-defined criteria. Filtering is illustrated in Figure 2-1.

Filtering a list of numbers from a larger list

Figure 2-1. Filtering a list of numbers from a larger list

When filtering, you produce another list (or collection) potentially smaller than the original, depending on the filtering criteria. In the number-classifier example, I use filtering to determine the factors of a number, as shown in Example 2-14.

Example 2-14. Filtering in Java 8

publicstatic IntStream factorsOf(int number) {

return range(1, number + 1)

.filter(potential -> number % potential == 0);

}

The code in Example 2-14 creates a range of numbers from 1 to the target number, then applies the filter() method to eliminate numbers that aren’t factors of the target number: the Java modulus operator (%) returns the remainder from integer division, where a zero remainder indicates a factor.

Although it is possible to achieve the same results without lambda blocks (as in Example 2-13), it is more concise in a language with them. A Groovy version appears in Example 2-15.

Example 2-15. Using filtering (called findAll()) in Groovy

staticdef factors(number) {

(1..number).findAll {number % it == 0}

}

In Example 2-15, rather than pass a single parameter, I use the single-parameter substitution keyword it as a placeholder, and the last line of the method is the method’s return value, which is the list of factors in this case.

NOTE

Use filter to produce a subset of a collection based on supplied filtering criteria.

Map

The map operation transforms a collection into a new collection by applying a function to each of the elements, as illustrated in Figure 2-2.

Mapping a function onto a collection

Figure 2-2. Mapping a function onto a collection

To illustrate map() and related transformations, I create an optimized version of my number classifier. First, I create an imperative version, shown in Example 2-16.

Example 2-16. Number classifier optimized

importjava.util.HashMap;

importjava.util.HashSet;

importjava.util.Map;

importjava.util.Set;

importstatic java.lang.Math.sqrt;

publicclassImpNumberClassifier {

privateint _number; 1

private Map<Integer, Integer> _cache; 2

public ImpNumberClassifier(int targetNumber) {

_number = targetNumber;

_cache = new HashMap<>();

}

privateboolean isFactor(int candidate) {

return _number % candidate == 0;

}

private Set<Integer> getFactors() {

Set<Integer> factors = new HashSet<>();

factors.add(1);

factors.add(_number);

for (int i = 2; i <= sqrt(_number); i++) 3

if (isFactor(i)) {

factors.add(i);

factors.add(_number / i);

}

return factors;

}

privateint aliquotSum() {

int sum = 0;

for (int i : getFactors())

sum += i;

return sum - _number;

}

privateint cachedAliquotSum() { 4

if (_cache.containsKey(_number))

return _cache.get(_number);

else {

int sum = aliquotSum();

_cache.put(_number, sum);

return sum;

}

}

publicboolean isPerfect() {

return cachedAliquotSum() == _number;

}

publicboolean isAbundant() {

return cachedAliquotSum() > _number;

}

publicboolean isDeficient() {

return cachedAliquotSum() < _number;

}

}

1

Internal state to prevent passing number as a parameter everywhere

2

Internal cache for more efficient sum lookup

3

A performance optimization is embedded in the getFactors() method. Observe that factors can always be harvested in pairs. For example, if the number in question is 16, when I grab the factor 2 I can also grab 8 because 2 × 8 = 16. If I harvest factors in pairs, I only need to check for factors up to the square root of the target number, which is precisely what the getFactors() method does.

4

Method to return cached sum if possible

Groovy of course includes functional transformation functions; the Groovy version appears in Example 2-17, where Groovy’s version of map() is called collect().

Example 2-17. Groovy optimized factors

staticdef factors(number) {

def factors = (1..round(sqrt(number)+1)).findAll({number % it == 0})

(factors + factors.collect {number / it}).unique()

}

In Example 2-17, the final call to unique() removes duplicates in the list, ensuring that whole-number square roots (such as 4) don’t appear in the list twice. To see how far functional programming can transform your code, consider the version of the number classifier written in Clojure inExample 2-18.

Example 2-18. The (classify ) function in Clojure encapsulates all the behavior within assignments

(defn classify [num]

(let [factors (->> (range 1 (inc num)) ; 1

(filter #(zero? (rem num %)))) ; 2

sum (reduce + factors) ; 3

aliquot-sum (- sum num)] ; 4

(cond ; 5

(= aliquot-sum num) :perfect

(> aliquot-sum num) :abundant

(< aliquot-sum num) :deficient)))

1

Methods become assignments.

2

Assign factors to filtered range.

3

Assign sum to reduced factors.

4

Calculate aliquot sum.

5

Return the keyword (enumeration) indicating the category.

If your functions collapse to single lines, you can shorten function definitions to a list of assigments, which is what happens in Example 2-18. The (let []) block in Clojure allows the creation of function-scoped assignments. First, I calculate the factors of the target number. To perform this calculation, I gather the range from 1 to the number ((range 1 (inc num)))—I call (inc num) because ranges in Clojure are noninclusive. Next, I use the (filter ) method to eliminate unwanted factors. Usually, in Clojure, I write this expression as (filter #(zero? (rem num %)) (range 1 (inc num))). However, because conceptually I generate the range first, I would rather it appear first as I read the code. Clojure’s thread-last macro (the ->> operator in Example 2-18) allows me to reorder it for readability. The assignment of sum and aliquot-sum are straightforward after I have the factors. The remaining body of the function compares the aliquot-sum in each case and returns the appropriate keyword (an enumeration delineated with a leading colon).

NOTE

Use map to transform collections in situ.

Fold/Reduce

The third common function has the most variations in name, and many subtle differences, among popular languages. foldLeft and reduce are specific variations on a list-manipulation concept called catamorphism, which is a generalization of list folding.

The reduce and fold operations have overlapping functionality, with subtle differences from language to language. Both use an accumulator to gather values. The reduce function is generally used when you need to supply an initial value for the accumulator, whereas fold starts with nothing in the accumulator. The order of operation on the collection is specified by the specific method name (for example, foldLeft or foldRight). Neither of these operations mutates the collection.

I show the foldLeft() function in Functional Java. In this case, a “fold left” means:

§ Use a binary function or operator to combine the first element of the list, if any, with the initial value of an accumulator.

§ Repeat step 1 until the list is exhausted and the accumulator holds the result of the fold.

Notice that this is exactly what you do when you sum a list of numbers: start with zero, add the first element, take that result and add it to the second, and continue until the list is consumed.

The aliquotSum() method in the Functional Java number classifier performs a sum across all the gathered factors and appears in Example 2-19.

Example 2-19. The foldLeft() method from Functional Java

publicint aliquotSum(List<Integer> factors) {

return factors.foldLeft(fj.function.Integers.add, 0) - factors.last();

}

At first it’s not obvious how the one-line body in Example 2-19 performs a sum operation to calculate the aliquotSum. In this case, the fold operation refers to a transformation that combines each element of the list with the next one, accumulating a single result for the entire list. A fold left combines the list elements leftward, starting with a seed value and accumulating each element of the list in turn to yield a final result. Figure 2-3 illustrates a fold operation.

Fold operation

Figure 2-3. Fold operation

Because addition is commutative, it doesn’t matter if you do a foldLeft() or foldRight(). But some operations (including subtraction and division) care about order, so the foldRight() method exists to handle those cases. In purely functional languages, left and right folds have implementation differences. For example, right folds can operate on infinite lists whereas left folds cannot.

Example 2-13 uses the Functional Java-supplied add enumeration; the framework includes the most common mathematical operations for you. But what about cases in which you need more refined criteria? Consider the code in Example 2-20.

Example 2-20. foldLeft() with user-supplied criteria

staticpublicint addOnlyOddNumbersIn(List<Integer> numbers) {

return numbers.foldLeft(new F2<Integer, Integer, Integer>() {

public Integer f(Integer i1, Integer i2) {

return (!(i2 % 2 == 0)) ? i1 + i2 : i1;

}

}, 0);

Functional Java is designed to work with pre-Java 8 JDKs, forcing the framework to improvise with single-method interfaces and anonymous inner classes. The built-in F2 class has the correct structure for a fold operation: it creates a method that accepts two integer parameters (these are the two values being folded upon one another) and the return type.

The reduce operation for Groovy’s number classifier appears in Example 2-21.

Example 2-21. Groovy’s version of reduce() (called inject())

staticdef sumFactors(number) {

factors(number).inject(0, {i, j -> i + j})

}

Groovy’s inject method uses the same signature of reduce shown in Example 2-18; the first parameter expects a seed value, and the second is a closure block that accepts two parameters and returns a single value. In this case, I add the two passed parameters: {i, j → i + j}.

Either fold or reduce is commonly used when you need to process a collection of items to produce a different-sized (usually smaller but not necessarily) value, which could be a collection or a single value.

NOTE

Use reduce or fold for piecewise collection processing.

Number classification is of course a contrived example, so it’s hard to generalize to different types of problems. However, I’ve noticed a significant change in coding style on projects using languages that support these abstractions (whether they are functional languages or not). I first noticed this on Ruby on Rails projects. Ruby has these same list-manipulation methods that use closure blocks, and I was struck by how frequently collect(), map(), and inject() appear. Once you become accustomed to having these tools in your toolbox, you’ll find yourself turning to them again and again.

One of the challenges of learning a different paradigm such as functional programming is learning the new building blocks and “seeing” them peek out of problems as a potential solution. In functional programming, you have far fewer abstractions, but each one is generic (with specificity added via higher-order functions). Because functional programming relies heavily on passed parameters and composition, you have fewer rules to learn about the interactions among moving parts, making your job easier.

Synonym Suffering

Functional programming languages feature several families of common functions. Yet developers sometimes have a hard time moving between languages because familiar functions have unfamiliar names. Functional languages tend to name these common functions based on functional paradigms. Languages that are derived from scripting backgrounds tend to use more descriptive names (sometimes multiple names, with several aliases that point to the same function).

Filter

With the filter function, you can specify a Boolean criterion (typically in the form of a higher-order function) to apply to a collection. The function returns the subset of the collection whose elements match the criterion. Filtering is closely related to find functions, which return the first matching element in a collection.

Scala

Scala has many filter variants. The simplest case filters a list based on passed criteria. In this first example, I create a list of numbers. Then I apply the filter() function, passing a code block that specifies the criterion that all elements must be divisible by 3:

val numbers =List.range(1, 11)

numbers filter (x => x % 3 == 0)

// List(3, 6, 9)

I can create a terser version of the code block by relying on implicit parameters:

numbers filter (_ % 3 == 0)

// List(3, 6, 9)

This second version is less verbose because Scala allows substitution of parameters with underscores. Both versions yield the same result.

Many examples of filtering operations use numbers, but filter() applies to any collection. This example applies filter() to a list of words to determine the three-letter words:

val words =List("the", "quick", "brown", "fox", "jumped",

"over", "the", "lazy", "dog")

words filter (_.length == 3)

// List(the, fox, the, dog)

Another filtering variant in Scala is the partition() function, which returns an alternate version of a collection by splitting it into multiple parts; the original collection is unchanged. The split is based on the higher-order function that you pass to determine the separation criteria. Here, thepartition() function returns two lists that are split according to which list members are divisible by 3:

numbers partition (_ % 3 == 0)

// (List(3, 6, 9),List(1, 2, 4, 5, 7, 8, 10))

The filter() function returns a collection of matching elements, whereas find() returns only the first match:

numbers find (_ % 3 == 0)

// Some(3)

However, the return value for find() isn’t the matched value itself, but rather one that’s wrapped in an Option class. Option has two possible values: Some or None. Scala, like some other functional languages, uses Option as a convention to avoid returning null in the absence of a value. The Some() instance wraps the actual return value, which is 3 in the case of numbers find (_ % 3 == 0). If I try to find something that doesn’t exist, the return is None:

numbers find (_ < 0)

// None

I discuss Option and similar classes in depth in Chapter 5.

Scala also includes several functions that process a collection based on a predicate function and either retain values or discard them. The takeWhile() function returns the largest set of values from the front of the collection that satisfy the predicate function:

List(1, 2, 3, -4, 5, 6, 7, 8, 9, 10) takeWhile (_ > 0)

// List(1, 2, 3)

The dropWhile() function skips the largest number of elements that satisfy the predicate:

words dropWhile (_ startsWith "t")

// List(quick, brown, fox, jumped, over, the, lazy, dog)

Groovy

Groovy isn’t considered a functional language, but it contains many functional paradigms—some with names that are derived from scripting languages. For example, the function that’s traditionally called filter() in functional languages is Groovy’s findAll() method:

(1..10).findAll {it % 3 == 0}

// [3, 6, 9]

Like Scala’s filter functions, Groovy’s work on all types, including strings:

def words = ["the", "quick", "brown", "fox", "jumped",

"over", "the", "lazy", "dog"]

words.findAll {it.length() == 3}

// [The, fox, the, dog]

Groovy also has a partition()-like function called split():

(1..10).split {it % 3}

// [ [1, 2, 4, 5, 7, 8, 10], [3, 6, 9] ]

The return value of the split() method is a nested array, like the nested list in Scala that’s returned from partition().

Groovy’s find() method returns the first match from the collection:

(1..10).find {it % 3 == 0}

// 3

Unlike Scala, Groovy follows Java conventions and returns null when find() fails to find an element:

(1..10).find {it < 0}

// null

Groovy also has takeWhile() and dropWhile() methods with similar semantics to Scala’s versions:

[1, 2, 3, -4, 5, 6, 7, 8, 9, 10].takeWhile {it > 0}

// [1, 2, 3]

words.dropWhile {it.startsWith("t")}

// [quick, brown, fox, jumped, over, the, lazy, dog]

As in the Scala example, dropWhile() acts as a specialized filter: it drops the largest prefix that matches the predicate, filtering only the first part of the list:

def moreWords = ["the", "two", "ton"] + words

moreWords.dropWhile {it.startsWith("t")}

// [quick, brown, fox, jumped, over, the, lazy, dog]

Clojure

Clojure has an astounding number of collection-manipulation routines, often usefully generic because of Clojure’s dynamic typing. Many developers gravitate toward Clojure because of the richness and flexibility of its collection libraries. Clojure uses traditional functional programming names, as shown by the (filter ) function:

(def numbers (range 1 11))

(filter (fn [x] (= 0 (rem x 3))) numbers)

; (3 6 9)

Like the other languages, Clojure has a terser syntax for simple anonymous functions:

(filter #(zero? (rem % 3)) numbers)

; (3 6 9)

And as in the other languages, Clojure’s functions work on any applicable type, such as strings:

(def words ["the" "quick" "brown" "fox" "jumped" "over" "the" "lazy" "dog"])

(filter #(= 3 (count %)) words)

; (the fox the dog)

Clojure’s return type for (filter ) is a Seq, which is delineated by parentheses. The Seq interface is the core abstraction for sequential collections in Clojure.

Map

The second major functional transformation that’s common to all functional languages is map. A map function accepts a higher-order function and a collection, then applies the passed function to each element and returns a collection. The returned collection (unlike with filtering) is the same size as the original collection, but with updated values.

Scala

Scala’s map() function accepts a code block and returns the transformed collection:

List(1, 2, 3, 4, 5) map (_ + 1)

// List(2, 3, 4, 5, 6)

The map() function works on all applicable types, but it doesn’t necessarily return a transformed version of each element of the collection. In this example, I return a list of the sizes of all elements in a string:

words map (_.length)

// List(3, 5, 5, 3, 6, 4, 3, 4, 3)

Nested lists occur so often in functional programming languages that library support for denesting—typically called flattening—is common. Here is an example of flattening a nested list:

List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9)) flatMap (_.toList)

// List(1, 2, 3, 4, 5, 6, 7, 8, 9)

The resulting List contains just the elements, with the extra infrastructure removed. The flatMap() function also works on data structures that might not seem nested in the traditional way. For example, you can consider a string as a series of nested characters:

words flatMap (_.toList)

// List(t, h, e, q, u, i, c, k, b, r, o, w, n, f, o, x, ...

Groovy

Groovy also includes several map variants called collect(). The default variant accepts a code block to apply to each element of the collection:

(1..5).collect {it += 1}

// [2, 3, 4, 5, 6]

Like the other languages, Groovy allows shorthand for simple anonymous higher-order functions; using the it reserved word for parameter substitution.

The collect() method works on any collection, as long as you can supply a reasonable predicate (a function that returns true or false). Thus, you can use collect() on a list of strings:

def words = ["the", "quick", "brown", "fox", "jumped",

"over", "the", "lazy", "dog"]

words.collect {it.length()}

// [3, 5, 5, 3, 6, 4, 3, 4, 3]

Groovy also has a method similar to flatMap() that collapses inner structure, called flatten():

[ [1, 2, 3], [4, 5, 6], [7, 8, 9] ].flatten()

// [1, 2, 3, 4, 5, 6, 7, 8, 9]

The flatten() method also works on nonobvious collections such as strings:

(words.collect {it.toList()}).flatten()

// [t, h, e, q, u, i, c, k, b, r, o, w, n, f, o, x, j, ...

Clojure

Clojure includes a (map ) function that accepts a higher-order function (which includes operators) and a collection:

(map inc numbers)

; (2 3 4 5 6 7 8 9 10 11)

The first parameter for (map ) can be any function that accepts a single parameter: a named function, an anonymous function, or a preexisting function such as inc that increments its argument. The more typical anonymous syntax is illustrated in this example, which generates a collection of the lengths of the words in a string:

(map #(count %) words)

; (3 5 5 3 6 4 3 4 3)

Clojure’s (flatten ) function is similar to Groovy’s:

(flatten [[1 2 3] [4 5 6] [7 8 9]])

; (1 2 3 4 5 6 7 8 9)

Fold/Reduce

The third common function has the most variations in name, and many subtle differences.

Scala

Scala has the richest set of fold operations, in part because it facilitates several typing scenarios that don’t appear in the dynamically typed Groovy and Clojure. Reduce is commonly used to perform sums:

List.range(1, 10) reduceLeft((a, b) => a + b)

// 45

The function that’s supplied to reduce() is typically a function or operator that accepts two arguments and returns a single result, thereby consuming the list. You can use Scala’s syntactic sugar to shorten the function definition:

List.range(1, 10).reduceLeft(0)(_ + _)

// 45

The reduceLeft() function assumes that the first element is the left side of the operation. For operators such as plus, the placement of operands is irrelevant, but order matters for operations such as divide. If you want to reverse the order in which the operator is applied, usereduceRight():

List.range(1, 10) reduceRight(_ - _)

// 8 - 9 = -1

// 7 - (-1) = 8

// 6 - 8 = -2

// 5 - (-2) = 7

// 4 - 7 = -3

// 3 - (-3) = 6

// 2 - 6 = -4

// 1 - (-4) = 5

// result: 5

When I mentioned reverse above, the application may not be intiuitive. The reduceRight() function reduces the direction of operations but not the ordering of parameters. Thus, it applies 8 - 9 first, then uses that result as the second parameter in subsequent calculations.

Understanding when you can use higher-level abstractions such as reduce is one of the keys to mastery of functional programming. This example uses reduceLeft() to determine the longest word in a collection:

words.reduceLeft((a, b) =>if (a.length > b.length) a else b)

// jumped

The reduce and fold operations have overlapping functionality, with subtle distinctions that I discussed earlier. One obvious difference suggests common uses. In Scala, the signature of reduceLeft[B >: A](op: (B, A) => B): B shows that the only parameter that’s expected is the function to combine elements. The initial value is expected to be the first value in the collection. In contrast, the signature of foldLeft[B](z: B)(op: (B, A) => B): B indicates an initial seed value for the result, so you can return types that are different from the type of the list elements.

Here’s an example of summing a collection by using foldLeft():

List.range(1, 10).foldLeft(0)(_ + _)

// 45

Scala supports operator overloading, so the two common fold operations, foldLeft and foldRight, have corresponding operators: /: and :\, respectively. Thus, you can create a terser version of sum by using foldLeft:

(0 /: List.range(1, 10)) (_ + _)

// 45

Similarly, to find the cascading difference between each member of a list (the reverse of a sum operation, admittedly a rare requirement) you can use either the foldRight() function or the :\ operator:

(List.range(1, 10) :\ 0) (_ - _)

// 5

Groovy

Groovy’s entry in the reduce category uses overloading of the inject() function to support the same functionality as Scala’s reduce() and foldLeft() options. One version of the function accepts an initial value. This example uses the inject() method to generate a sum over a collection:

(1..10).inject {a, b -> a + b}

// 55

The alternate form accepts an initial value:

(1..10).inject(0, {a, b -> a + b})

// 55

Groovy has a much smaller functional library than Scala or Clojure—not surprising in view of the fact that Groovy is a multiparadigm language that doesn’t emphasize functional programming.

Clojure

Clojure is primarily a functional programming language, so it supports (reduce ). The (reduce ) function accepts an optional initial value to cover both the reduce() and foldLeft() cases in Scala. The (reduce ) function brings no surprises. It accepts a function that expects two arguments and a collection:

(reduce + (range 1 11))

; 55

Clojure includes advanced support for reduce-like functionality in a library called Reducers.

Part of the challenge of learning a different paradigm (such as functional programming) is learning new terminology. That effort is complicated when different communities use different terms. But once you grasp the similarities, you see that functional languages offer overlapping functionality in syntactically surprising ways.