Becoming Functional (2014)
Chapter 6. Strict and Nonstrict Evaluations
An evaluation is the execution of a statement, usually one that computes a value and assigns it to a variable. So what exactly does it mean to have a strict versus a nonstrict evaluation? Generally speaking, we as developers use strict evaluations. This means that the statement is evaluated and its result assigned to the variable as soon as the variable is defined.
This obviously means that with nonstrict evaluations we don’t assign the variable where it is defined. This is also known as a “lazy” variable; the variable isn’t actually assigned until the first time it is used. This is really useful when we have variables that may not be used in a specific situation. Let’s look at a mathematical example.
MATH WARNING
Let’s assume that we have three functions: a(x), b(x), and f(x), where f(x) = a(x) / b(x).
If we look at this equation, we know to evaluate b(x) first because if it equals 0, there is no point in evaluating a(x) given that the entire equation fails. Our lazy value is a(x), and this is the point of a lazy variable.
When thinking of lazy variables, we tend to also think of mutable variables, because we think of the variable being defined and eventually being set. We normally think about the Java example in Example 6-1. However, with nonstrict evaluation, we maintain immutability; the variable is assigned or evaluated only on the first reference. This means that before the variable is used, it doesn’t exist; and as soon as it’s referenced, the variable becomes defined.
Example 6-1. A lazy variable in Java
public static double f(int x) {
    int brtn = b(x);
    if(brtn == 0) {
        throw new IllegalArgumentException("Input gave a 0 value from b(x)");
    }
    return a(x) / brtn;
}
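The idea that a lazy variable is assigned only on its first reference, and never changes afterward, can be sketched in plain Java. This is a minimal illustration rather than code from XXY: LazyValue and LazyValueDemo are hypothetical names, a(x) is a stand-in that simply counts its own calls, and the class is not thread-safe as written.

```java
import java.util.function.Supplier;

// Hypothetical helper: wraps a computation and runs it at most once.
final class LazyValue<T> {
    private Supplier<T> initializer; // released after the first use
    private T value;
    private boolean evaluated = false;

    LazyValue(Supplier<T> initializer) {
        this.initializer = initializer;
    }

    // The first call runs the initializer; later calls return the cached result.
    T get() {
        if (!evaluated) {
            value = initializer.get();
            evaluated = true;
            initializer = null;
        }
        return value;
    }

    boolean isEvaluated() {
        return evaluated;
    }
}

class LazyValueDemo {
    static int calls = 0;

    // Stand-in for an expensive a(x); counts how often it runs.
    static int a(int x) {
        calls++;
        return x * 10;
    }

    public static void main(String[] args) {
        LazyValue<Integer> lazyA = new LazyValue<>(() -> a(4));
        System.out.println(lazyA.isEvaluated()); // false: nothing computed yet
        System.out.println(lazyA.get());         // 40: computed on first reference
        System.out.println(lazyA.get());         // 40: cached, a(x) is not rerun
        System.out.println(calls);               // 1
    }
}
```

From the caller's point of view the value never changes once it has been read, which is why laziness and immutability can coexist.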
Your boss at XXY has asked that you create a new function that can get a list of enabled Contacts for all enabled Customers. Let’s start with the simplest implementation by using a method. The method will be called enabledContacts(), and we’ll add it to the Customer. We see this implementation in Example 6-2.
Example 6-2. All enabled contacts method in Customer.java
public List<Contact> enabledContacts() {
    contacts.findAll { contact -> contact.enabled }
}
Well, that was pretty easy, but what happens if we call this multiple times? Every call reruns findAll over the contacts. That’s an easy fix: let’s just make this into a member variable instead of a method.
Strict Evaluation
So, strict evaluation means that we will create and evaluate the setting of the variable at the time we define it. This is how we normally think of variables, so let’s go ahead and initialize our enabledContacts member during the creation of the Customer object, as shown in Example 6-3.
Example 6-3. All enabled contacts member being set in constructor
this.enabledContacts = contacts.findAll { contact -> contact.enabled }
Awesome—now we have our enabledContacts member, which can be accessed as many times as we want and we don’t have to worry about rerunning the findAll. So let’s go ahead and write our code to actually obtain all enabled Contacts for all enabled Customers. We’ll need to add a quick function call to flatten() because our enabledContacts is a list, and we’re collecting a list of those lists to have a result of List<List<Contact>>. The call to flatten() will collapse all of the inner lists together and return a List<Contact> (Example 6-4).
Example 6-4. Iterate over all customers and get their enabledContacts
Customer.allCustomers.findAll { customer ->
    customer.enabled
}.collect { customer ->
    customer.enabledContacts
}.flatten()
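The collapsing step performed by flatten() can be sketched in Java with streams. This is only an illustration: FlattenDemo and its flatten method are hypothetical names, plain strings stand in for Contact objects, and the sketch handles a single level of nesting, which is all this example needs.

```java
import java.util.List;
import java.util.stream.Collectors;

class FlattenDemo {
    // Collapses a list of lists into a single list, preserving order.
    static <T> List<T> flatten(List<List<T>> nested) {
        return nested.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // One inner list per customer; the second customer has no enabled contacts.
        List<List<String>> perCustomer = List.of(
                List.of("Ann", "Bob"), List.<String>of(), List.of("Cid"));
        System.out.println(flatten(perCustomer)); // [Ann, Bob, Cid]
    }
}
```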
Uh oh, our boss has come back saying that the application is taking forever to start up and run. Since we’re using strict evaluation, we’re actually creating our enabledContacts list even if the Customer is disabled; so how can we skip evaluating the variable if we don’t need it? Lazy evaluation allows us to define the variable but not evaluate its value until the first time it is referenced.
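The wasted work is easy to make visible with a counter. The following Java sketch is a simplified stand-in for the XXY classes: EagerContact, EagerCustomer, StrictCostDemo, and the filterRuns counter are all hypothetical names introduced for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified stand-ins for the XXY classes.
class EagerContact {
    final String name;
    final boolean enabled;

    EagerContact(String name, boolean enabled) {
        this.name = name;
        this.enabled = enabled;
    }
}

class EagerCustomer {
    static int filterRuns = 0; // counts how many times the filter was evaluated

    final boolean enabled;
    final List<EagerContact> enabledContacts;

    EagerCustomer(boolean enabled, List<EagerContact> contacts) {
        this.enabled = enabled;
        // Strict evaluation: the filter runs here whether or not anyone reads the result.
        filterRuns++;
        this.enabledContacts = contacts.stream()
                .filter(contact -> contact.enabled)
                .collect(Collectors.toList());
    }
}

class StrictCostDemo {
    static List<EagerContact> enabledContactsOfEnabledCustomers(List<EagerCustomer> customers) {
        return customers.stream()
                .filter(customer -> customer.enabled)
                .flatMap(customer -> customer.enabledContacts.stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<EagerContact> contacts = List.of(
                new EagerContact("Ann", true), new EagerContact("Bob", false));
        List<EagerCustomer> customers = List.of(
                new EagerCustomer(true, contacts), new EagerCustomer(false, contacts));
        System.out.println(enabledContactsOfEnabledCustomers(customers).size()); // 1
        System.out.println(EagerCustomer.filterRuns); // 2: the disabled customer paid too
    }
}
```

Only one customer contributes to the result, yet the filter ran for both; with thousands of disabled customers, that cost is paid at startup for nothing.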
Nonstrict (Lazy) Evaluation
So let’s start by following the normal imperative method most people would use to accomplish this. We’ll create the member as private and then add a getter method. We’ll then synchronize the method and check to see if the object is initialized, creating it if not, and then return that (Example 6-5).
Example 6-5. All enabled contacts method with deduplication in Customer.java
private List<Contact> enabledContacts = null
public synchronized List<Contact> getEnabledContacts() {
    if(this.enabledContacts == null) {
        this.enabledContacts = this.contacts.findAll { contact ->
            contact.enabled
        }
    }
    return this.enabledContacts
}
Obviously this works, but it’s really undesirable because now we have a completely different methodology to access the enabledContacts member. We’re actually going to be calling a method rather than doing a simple member access. Good thing we’re using Groovy and we get the @Lazy annotation!
Before we start throwing around the @Lazy annotation, let’s actually play around with lazy variables in separate scripts. We’ll create a simple class TestClass, which will have an array of numbers from 1 to 6, and another that contains only the odd numbers, as shown in Example 6-6.
RUNNING EXAMPLES
For the rest of the examples in this chapter, these are all scripts and there is no need for compilation.
Groovy examples
Copy the code into a file and run “groovy filename.groovy”.
Scala example
Copy the code into a file and run “scala filename.scala”.
Example 6-6. TestClass with nonlazy member
class TestClass {
    def all = [1,2,3,4,5,6]
    def odd = all.findAll { num -> num%2 == 1 }
}
println(new TestClass().odd)
So we obviously know that the member odd gets initialized as soon as we call new TestClass(). But let’s verify this by modifying the code a bit, as in Example 6-7.
Example 6-7. TestClass with nonlazy member and print statements
class TestClass {
    def all = [1,2,3,4,5,6]
    def odd = all.findAll { num -> println("Foo"); num%2 == 1; }
}
def tc = new TestClass()
println("Bar")
println(tc.odd)
As assumed, we see a bunch of "Foo" statements get printed followed by a "Bar", and finally the array itself. But we can change this functionality by adding the @Lazy annotation to the odd member, as shown in Example 6-8.
Example 6-8. TestClass with lazy member and print statements
class TestClass {
    def all = [1,2,3,4,5,6]
    @Lazy def odd = all.findAll { num -> println("Foo"); num%2 == 1; }
}
def tc = new TestClass()
println("Bar")
println(tc.odd)
As we can see, we have the "Bar" printed out followed by a bunch of "Foo" statements and finally the array. Notice that the odd member doesn’t actually get evaluated until it’s referenced. Now, this has a really nasty side effect: if we were to change all before we called odd, then when we do call odd we’re going to be getting the new evaluation based on the new value of all. This is shown in Example 6-9.
Example 6-9. TestClass with lazy member; we change the all variable before referencing odd
class TestClass {
    def all = [1,2,3,4,5,6]
    @Lazy def odd = all.findAll { num -> num%2 == 1 }
}
def tc = new TestClass()
tc.all = [1,2,3]
println(tc.odd)
The output here is the list of odd numbers but only between 1 and 3 (because we referenced odd after we had changed the all variable). So what happens if we reference odd before we change the all variable? Does this mean that the variable odd would be set and would no longer be updated? Let’s see this in Example 6-10.
Example 6-10. TestClass with lazy member; we change the all variable reference after referencing odd
class TestClass {
    def all = [1,2,3,4,5,6]
    @Lazy def odd = all.findAll { num -> num%2 == 1 }
}
def tc = new TestClass()
println(tc.odd)
tc.all = [1,2,3]
println(tc.odd)
We see two lists printed out that are exactly the same; they are of odd numbers from 1 to 5. Wait—we changed all, which should mean that the second list we printed out should’ve been odd numbers, but only from 1 to 3. Ah, but as we said before: the laziness of the odd variable means that the evaluation only occurs once. This means on the first reference of odd, it will be set and will not be reevaluated.
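The same two experiments can be reproduced in Java with a hand-written lazy getter. This is a sketch for comparison only: LazyOddHolder and LazyOddDemo are hypothetical names, and the getter is deliberately simple and not thread-safe.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Java analogue of TestClass: odd is computed on first access and then kept.
class LazyOddHolder {
    List<Integer> all = List.of(1, 2, 3, 4, 5, 6);
    private List<Integer> odd; // null until first requested

    List<Integer> getOdd() {
        if (odd == null) {
            List<Integer> result = new ArrayList<>();
            for (int num : all) {
                if (num % 2 == 1) {
                    result.add(num);
                }
            }
            odd = result;
        }
        return odd;
    }
}

class LazyOddDemo {
    public static void main(String[] args) {
        // Change all before the first access: the new value is what gets filtered.
        LazyOddHolder before = new LazyOddHolder();
        before.all = List.of(1, 2, 3);
        System.out.println(before.getOdd()); // [1, 3]

        // Change all after the first access: the cached list comes back unchanged.
        LazyOddHolder after = new LazyOddHolder();
        System.out.println(after.getOdd()); // [1, 3, 5]
        after.all = List.of(1, 2, 3);
        System.out.println(after.getOdd()); // [1, 3, 5]
    }
}
```

Both behaviors follow from the same rule: the initializer reads whatever all holds at the moment of the first access, and never runs again.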
So now, let’s make use of the @Lazy annotation on our enabledContacts variable, as in Example 6-11.
BEING LAZY HAS ITS OWN QUIRKS
In Groovy, when we use the @Lazy annotation, the Groovy compiler generates a getter for the member, which does a lazy generation of the member. This means that it will create it on the first access if it doesn’t already exist, but if it does will reuse it. This works until you try to use the final modifier.
Groovy will then pass the final modifier directly to Java, and you will end up trying to modify a final variable due to the way @Lazy works.
Example 6-11. All enabled contacts as lazy member in Customer.java
@Lazy public volatile List<Contact> enabledContacts = contacts.findAll { contact ->
    contact.enabled
}
CONCURRENCY NOTICE
In Groovy, you will need to add the volatile keyword when using @Lazy; otherwise, this code gets converted into non–thread-safe code.
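Why volatile matters is easiest to see in the classic double-checked locking idiom, a common way to write a thread-safe lazy getter by hand in Java. The sketch below illustrates that idiom and is not a copy of the code Groovy generates; ThreadSafeLazy and ThreadSafeLazyDemo are hypothetical names, and the initializer is assumed never to return null.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper: computes a value at most once, even under concurrent access.
final class ThreadSafeLazy<T> {
    private final Supplier<T> initializer;
    // volatile guarantees that a thread seeing a non-null value sees it fully built.
    private volatile T value; // null means "not computed yet"

    ThreadSafeLazy(Supplier<T> initializer) {
        this.initializer = initializer;
    }

    T get() {
        T local = value;               // fast path: one volatile read, no lock
        if (local == null) {
            synchronized (this) {      // slow path: callers queue up here
                local = value;
                if (local == null) {   // re-check: another thread may have finished first
                    local = initializer.get();
                    value = local;
                }
            }
        }
        return local;
    }
}

class ThreadSafeLazyDemo {
    // Starts threadCount threads that all read the same lazy value and
    // returns how many times the initializer actually ran.
    static int countComputations(int threadCount) {
        AtomicInteger computations = new AtomicInteger();
        ThreadSafeLazy<String> lazy = new ThreadSafeLazy<>(() -> {
            computations.incrementAndGet();
            return "computed";
        });
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(lazy::get);
            threads[i].start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return computations.get();
    }

    public static void main(String[] args) {
        System.out.println(countComputations(8)); // 1
    }
}
```

Without volatile on the field, the unlocked first read would have no visibility guarantee, which is the kind of non-thread-safe behavior the notice warns about. Threads that arrive while the initializer is running wait on the lock until the value is ready.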
In Example 6-12, let’s look at a lazy variable definition in Scala for comparison.
Example 6-12. All enabled contacts as lazy member in Scala
lazy val enabledContacts = contacts.filter { contact =>
  contact.enabled
}
Notice that lazy becomes a modifier. For those not familiar with Scala, defining a variable is done with val or var, meaning an immutable or mutable variable respectively. Finally, we filter our contacts. Notice that the big difference between Scala and Groovy within the anonymous function syntax is switching from -> to =>, which separates our parameters from the body.
Laziness Can Create Problems
Sometimes creating lazy variables can cause problems; for example, let’s say that you have a variable that a large number of threads rely on. If you use a lazy variable, this means that all the threads will block until the variable has been computed.
Let’s see an example where doing lazy variables might be worse than if we just took the time to compute it in the beginning. We’re going to step away from XXY and look at a simple example. Let’s assume that we have a Customer container, as shown in Example 6-13.
Example 6-13. Problem with laziness shown in Groovy
class Customer {
    final Integer id
    final Boolean enabled
    public Customer(id, enabled) { this.id = id; this.enabled = enabled; }
}
class CustomerContainer {
    public List<Customer> customers = []
    @Lazy public volatile List<Customer> onlyEnabled = {
        customers.findAll { customer ->
            customer.enabled
        }
    }()
    public CustomerContainer() { this([]) }
    public CustomerContainer(customers) { this.customers = customers }
    def addCustomer(c) {
        new CustomerContainer(customers.plus(customers.size(), [c]))
    }
    def removeCustomer(c) {
        new CustomerContainer(customers.findAll { customer -> customer.id != c.id })
    }
}
def cc = new CustomerContainer()
cc = cc.addCustomer(new Customer(1, true))
cc = cc.addCustomer(new Customer(2, false))
println(cc.customers)
So now we have a container that we can keep updating in a thread-safe manner. Notice, though, that we have our onlyEnabled as a @Lazy variable. The unfortunate part here is that the runtime slows down if we are constantly changing the container and we have a multitude of threads. Each time the container refreshes, all threads will block on access to the onlyEnabled field the first time it is accessed. Let’s try to fix this in Example 6-14.
Example 6-14. Problem with laziness in Groovy, fixed
class Customer {
    final Integer id
    final Boolean enabled
    public Customer(id, enabled) { this.id = id; this.enabled = enabled; }
}
class CustomerContainer {
    public List<Customer> customers = []
    public List<Customer> onlyEnabled = []
    public CustomerContainer() { this([]) }
    public CustomerContainer(customers) {
        this.customers = customers
        this.onlyEnabled = customers.findAll { customer -> customer.enabled }
    }
    def addCustomer(c) {
        new CustomerContainer(customers.plus(customers.size(), [c]))
    }
    def removeCustomer(c) {
        new CustomerContainer(customers.findAll { customer -> customer.id != c.id })
    }
}
def cc = new CustomerContainer()
cc = cc.addCustomer(new Customer(1, true))
cc = cc.addCustomer(new Customer(2, false))
println(cc.customers)
With the @Lazy annotation removed, only the thread responsible for adding or removing customers blocks while it takes the time to populate our list. The rest of our threads can continue to process requests without blocking on the first call to onlyEnabled.
But where would a good place to use laziness be in this example? Let’s assume that there is a revenue number tied to every customer which is based on their contracts. In Example 6-15 there is a revenue variable in our Customer class, but we don’t always need to evaluate that variable, which is why we’ve used a @Lazy variable.
Example 6-15. Lazy calculation of revenue variable in Groovy
class Customer {
    final Integer id
    final Boolean enabled
    final List<Double> contracts
    @Lazy volatile Double revenue = calculateRevenue(this.contracts)
    static def calculateRevenue(contracts) {
        Double sum = 0.0
        for(Double contract : contracts) {
            sum += contract
        }
        sum
    }
    public Customer(id, enabled, contracts) {
        this.id = id
        this.enabled = enabled
        this.contracts = contracts
    }
}
class CustomerContainer {
    public List<Customer> customers = []
    public List<Customer> onlyEnabled = []
    public CustomerContainer() { this([]) }
    public CustomerContainer(customers) {
        this.customers = customers
        this.onlyEnabled = customers.findAll { customer -> customer.enabled }
    }
    def addCustomer(c) {
        new CustomerContainer(customers.plus(customers.size(), [c]))
    }
    def removeCustomer(c) {
        new CustomerContainer(customers.findAll { customer -> customer.id != c.id })
    }
}
def cc = new CustomerContainer()
cc = cc.addCustomer(new Customer(1, true, [100.0, 200.0, 300.0]))
cc = cc.addCustomer(new Customer(2, false, [100.0, 150.0, 500.0]))
println(cc.customers)
Double sum = 0.0
for(Customer customer : cc.onlyEnabled) {
    sum += customer.revenue
}
println("Enabled Revenue: ${sum}")
Since we’re going to be diving into Scala due to its increased focus on functional programming, in Example 6-16, the exact same functionality shown in Example 6-15 is rewritten in Scala. This is for a direct comparison and will give you a good idea of the syntax and some basics of Scala.
Example 6-16. Scala representation of Example 6-15
class Customer(val id: Integer,
               val enabled: Boolean,
               val contracts: List[Double]) {
  lazy val revenue: Double = calculateRevenue(this.contracts)
  def calculateRevenue(contracts: List[Double]): Double = {
    var sum: Double = 0.0
    for(contract <- contracts) {
      sum += contract
    }
    sum
  }
}
class CustomerContainer(val customers: List[Customer] = List()) {
  val onlyEnabled = customers.filter { customer => customer.enabled }
  def addCustomer(c: Customer): CustomerContainer = {
    new CustomerContainer(customers ::: List(c))
  }
  def removeCustomer(c: Customer): CustomerContainer = {
    new CustomerContainer(customers.filter { customer => customer.id != c.id })
  }
}
var cc = new CustomerContainer()
cc = cc.addCustomer(new Customer(1, true, List(100.0, 200.0, 300.0)))
cc = cc.addCustomer(new Customer(2, false, List(100.0, 150.0, 500.0)))
println(cc.customers)
var sum: Double = 0.0
for(customer <- cc.onlyEnabled) {
  sum += customer.revenue
}
println(s"Enabled Revenue: ${sum}")
Conclusion
Lazy evaluations have allowed us to speed up the runtime of our application, since we only need to build our enabledContacts when we need it. We’ve also learned that there are times we need to be careful, as we may end up blocking all of our threads from working while the lazy variable is evaluated.
There are obvious pros and cons to utilizing strict and nonstrict (lazy) evaluations; learning when and where to use them is important in producing good functional code. It allows us to describe variables that we don’t necessarily want to waste processing time on if we don’t need to.
Many of you may have already seen some of these concepts in Object-Relational Mappers (ORMs) such as Hibernate with a lazy fetch. Generally you use it in relationships between objects, so that you don’t load hundreds of relationships unless you absolutely need to.
Now think about when you may not want to. For example, you might have a Contact object as well as a linkage to its friends which were also Contacts. Maybe you need that every time the user logs in; if so, a lazy variable is not going to help you.
Generally speaking, strict evaluation is important when you have frequently accessed members of an object—especially if they exist in a multithreaded environment and are being used by all threads. On the other hand, if you have variables that are referenced infrequently or are extremely expensive to compute, it’s more useful to evaluate them only if absolutely necessary.