Immutable Variables - Becoming Functional (2014)

Chapter 4. Immutable Variables

Immutability is a topic that makes many people shudder when they first encounter it. Let’s get the big question out of the way first: how can an application run if variables never change? It’s a good question, so let’s start with two rules about immutability:

§ Local variables do not change.

§ Global variables can change only references.

Object variables, especially in Java, are references to the object itself. This means that changing the “reference” to which the variable points should be an atomic process. This is important because if we are going to update the variable, we will access it either pre- or post-update but never in an intermediate state. We’ll discuss this a little later, but right now, let’s look at mutability.

WE’RE GETTING GROOVY NOW

Remember from the preceding chapter that we’re going to be writing in Groovy from this point on.

Mutability

When we think of variables, we normally think of mutable variables. After all, a variable is variable, which means that we should be able to store many different values in it and reuse it.

As we think of mutable variables, we realize that this is how we normally write code—with variables that inherently change over time. In Example 4-1, notice how f changes and is assigned two distinct values? This is how we normally deal with variables.

Example 4-1. Modifying a variable

    def f = 10
    f = f + f

So what happens when we have a variable that is passed to a function and we try to mutate that? Let’s see in Example 4-2.

Example 4-2. Modifying a variable passed to a function

    def f = "Foo"

    def func(obj) {
        obj = "Bar"
    }

    println f
    func(f)
    println f

We can see from the output that we get two "Foo" printouts. This is correct because the reference that f contained, "Foo", was passed to func, and then we update the variable obj with a new reference to "Bar". But because there is no connection between obj and f, f remains unchanged and contains our original reference to "Foo".

This was probably not what the author intended, so he fixes it by using a mutable object containing the reference he wants to change. Let’s see this in action in Example 4-3.

Example 4-3. Modifying a variable passed into a function

    class Foo {
        String str
    }

    def f = new Foo(str: "Foo")

    def func(Foo obj) {
        obj.str = "Bar"
    }

    println f.str
    func(f)
    println f.str

We can see that, although f didn’t change, f.str did. This looks like it’s a fairly standard mutation of an object, but let’s think about this in another light. What if it were not clear that func was going to mutate f.str, and we now need to determine why f.str has changed over time? We’ll need to debug to find out that func is indeed changing our variable.

Using code comments or setting something in the name of the function to indicate that you are mutating the object is one way to help answer the question “Why did this change?” Immutability gives us the confidence that our variables will not be changing and that our objects will be the same no matter to which function we send them.

Let’s head back over to XXY. Your boss has come back with another request, this time a little more sane. He needs to send emails to the customers if the following conditions are met:

§ The Customer is enabled.

§ The Contract is enabled.

§ The Contract has not expired.

§ The Contact is still enabled.

The boss has indicated that this really shouldn’t be a big deal because someone else already added a list of Contacts to the Customer class. The definition of a Contact is in the Contact.java file, shown in Example 4-4.

Example 4-4. Contact.java file

    public class Contact {

        public Integer contact_id = 0;
        public String firstName = "";
        public String lastName = "";
        public String email = "";
        public Boolean enabled = true;

        public Contact(Integer contact_id,
                       String firstName,
                       String lastName,
                       String email,
                       Boolean enabled) {
            this.contact_id = contact_id;
            this.firstName = firstName;
            this.lastName = lastName;
            this.email = email;
            this.enabled = enabled;
        }
    }

The message template is as follows, where <firstName> and <lastName> are placeholders to be replaced by the user’s name:

Hello <firstName> <lastName>,

We would like to let you know that a new product is available for you to try. Please feel free to give us a call at 1-800-555-1983 if you would like to see this product in action.

Sincerely, Your Friends at XXY

We’re going to add the functionality to the Customer class. Let’s think about this functionally. First, we will findAll the Customer.allCustomers records where both the customer and the customer’s contract are enabled. For each of those customers, we will then findAll contacts that are enabled. And finally, for each of those contacts, we will sendEmail. Let’s go ahead and write the code in Groovy, as seen in Example 4-5.

Example 4-5. sendEnabledCustomersEmails method

    public static void sendEnabledCustomersEmails(String msg) {
        Customer.allCustomers.findAll { customer ->
            customer.enabled && customer.contract.enabled
        }.each { customer ->
            customer.contacts.findAll { contact ->
                contact.enabled
            }.each { contact ->
                contact.sendEmail(msg)
            }
        }
    }

I don’t want to get too far into a battle about how best to handle sending emails, so let’s assume that we’ve already written Contact.sendEmail, which takes a string, performs a replace for member variables, and then sends out the email. Let’s get even more functional—we might need to do something else later for each enabled Contact. So, let’s use a closure, as shown in Example 4-6.

Example 4-6. eachEnabledContact closure

    public static void eachEnabledContact(Closure cls) {
        Customer.allCustomers.findAll { customer ->
            customer.enabled && customer.contract.enabled
        }.each { customer ->
            // Only enabled contacts are handed to the closure.
            customer.contacts.findAll { contact ->
                contact.enabled
            }.each(cls)
        }
    }

Now, we can call Customer.eachEnabledContact({ contact -> contact.sendEmail(msg) }) and get our functionality. At this point, we have a nice set of functionality that we can call anytime we need to do something for all enabled contacts. For example, we might just want to create a list of all the enabled contacts.
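As a rough sketch of that last idea, the same traversal can feed a callback that appends to a list instead of sending mail. The example below is written in plain Java rather than Groovy, and SimpleContact, SimpleCustomer, and EnabledContacts are hypothetical stand-ins for the chapter’s classes, cut down to the fields the traversal needs (the contract check is folded into a single contractEnabled flag, which is an assumption made for brevity).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical, trimmed-down stand-ins for the chapter's Contact and Customer.
final class SimpleContact {
    final String email;
    final boolean enabled;

    SimpleContact(String email, boolean enabled) {
        this.email = email;
        this.enabled = enabled;
    }
}

final class SimpleCustomer {
    final boolean enabled;
    final boolean contractEnabled;
    final List<SimpleContact> contacts;

    SimpleCustomer(boolean enabled, boolean contractEnabled, List<SimpleContact> contacts) {
        this.enabled = enabled;
        this.contractEnabled = contractEnabled;
        this.contacts = contacts;
    }
}

final class EnabledContacts {
    // Calls action once for every enabled contact of every enabled customer
    // whose contract is also enabled.
    static void forEach(List<SimpleCustomer> customers, Consumer<SimpleContact> action) {
        for (SimpleCustomer customer : customers) {
            if (customer.enabled && customer.contractEnabled) {
                for (SimpleContact contact : customer.contacts) {
                    if (contact.enabled) {
                        action.accept(contact);
                    }
                }
            }
        }
    }

    // Reuses forEach to collect email addresses instead of sending mail.
    static List<String> emails(List<SimpleCustomer> customers) {
        List<String> result = new ArrayList<>();
        forEach(customers, contact -> result.add(contact.email));
        return result;
    }
}
```

The traversal is written once; what happens to each contact is decided entirely by the function passed in, which is the same design choice eachEnabledContact makes with its closure.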

Your boss has asked you to add functionality to change a Contact’s name and email, because people get married or have other life events requiring name changes. Now let’s assume that our application is actually threaded (maybe it’s a web server). If you don’t see an issue, you’re about to.

You just sat down to work, happy that you got the “change name and email” functionality done and rolled out. You get an email from your boss asking you to take a look at a new blocker bug: “Send email sometimes sends to an old email address.” The support team includes the broken email in the bug as well.

from: XXY Product Trials <trials@xxy.com>

to: Jane Doe <jdoe@company.com>

subject: New Product Trial

Hello Jane Smith,

We would like to let you know that a new product is available for you to try. Please feel free to give us a call at 1-800-555-1983 if you would like to see this product in action.

Sincerely, Your Friends at XXY

In the bug, the support team says Jane just got married and her name changed from Jane Doe to Jane Smith. The thing they can’t figure out is why the email went to Jane Doe <jdoe@company.com> but her name is referenced as Jane Smith in the body.

OK, before I break down the entire runtime, I’ll try to explain this. User A updates a contact’s last name and email and clicks Save at the same time that User B clicks Send email. Because we have no synchronization, it’s possible for the name to be updated but not the email at the moment the email is actually created. Let’s look at the simplified sequence of events in Table 4-1.

Table 4-1. Simplified user runtime

Step  User A                    User B
1     Saves user name change    Clicks “Send email”
2     System updates last name  Unscheduled
3     Unscheduled               Sends email with inconsistent data
4     System updates email      Unscheduled

Concurrency means there is no guarantee that a shared variable will actually be in a specific state at any given time. How do you even reproduce concurrency bugs? How do you validate that you have actually fixed a concurrency bug?
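One way to get a feel for the bug without fighting the thread scheduler is to replay the interleaving from Table 4-1 by hand, on a single thread. The Java sketch below does that; MutableContact and TornReadSketch are hypothetical names, and jsmith@company.com is an invented new address used only for illustration.

```java
// Hypothetical mutable contact, mirroring the public fields of Contact.java.
final class MutableContact {
    String firstName;
    String lastName;
    String email;

    MutableContact(String firstName, String lastName, String email) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.email = email;
    }
}

final class TornReadSketch {
    // Builds the recipient and greeting from whatever the fields hold right now.
    static String compose(MutableContact c) {
        return "to: " + c.email + " | Hello " + c.firstName + " " + c.lastName;
    }

    // Replays steps 2 through 4 of Table 4-1 in a fixed order.
    static String replay() {
        MutableContact jane = new MutableContact("Jane", "Doe", "jdoe@company.com");
        jane.lastName = "Smith";               // step 2: User A's first write
        String sent = compose(jane);           // step 3: User B reads between the writes
        jane.email = "jsmith@company.com";     // step 4: User A's second write, too late
        return sent;
    }
}
```

Calling TornReadSketch.replay() returns "to: jdoe@company.com | Hello Jane Smith", the same mix of old address and new name that the support team reported. In the real application the ordering is up to the scheduler, which is why the bug shows up only sometimes.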

We haven’t even looked at a more likely scenario: what happens if we have functionality to remove a Contact or a Customer? Now one thread might be iterating over our list while another removes an item from it. Let’s look at all of these issues in one fell swoop. There are two primary ways to fix our concurrency issue:

§ Synchronize all access to the Customer.allCustomers object.

§ Ensure that the Customer.allCustomers list and its members cannot be changed.

Our first option means that we must have a synchronized block for every possible access of the Customer.allCustomers object. Invariably someone will forget to do a synchronized access and break the entire paradigm.

Our second option is much better; anyone can write any accessor to the Customer.allCustomers variable without worrying about the list mutating. Of course, this means that we have to be able to generate new lists with updated members. This is the idea behind immutability.

Immutability

As we get deeper into immutability, think about database transactions. Database transactions are atomic, which means that the system is either in a pre-transaction or post-transaction state, never in a mid-transaction state.

This means that when a database transaction is committed, the new records are made available to new queries. Older queries are still using older data, which is fine because the functionality they were doing was predicated on the previous data.
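The same idea can be sketched in Java with a reference that is swapped in one step. This is only an illustration under my own assumptions: NameRegistry is a hypothetical class, and it stores plain strings rather than Customer records to keep the example short. A reader that grabbed a snapshot before the "commit" keeps seeing the old, consistent data; new readers see the new list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical registry: the reference changes, the lists it points to never do.
final class NameRegistry {
    private final AtomicReference<List<String>> current;

    NameRegistry(List<String> initial) {
        // Copy defensively, then wrap so nobody can mutate the published list.
        current = new AtomicReference<>(
                Collections.unmodifiableList(new ArrayList<>(initial)));
    }

    // Like a query: returns the list as of this moment.
    List<String> snapshot() {
        return current.get();
    }

    // Like a commit: builds a complete new list, then publishes it atomically.
    void rename(String oldName, String newName) {
        current.updateAndGet(list -> {
            List<String> next = new ArrayList<>(list.size());
            for (String name : list) {
                next.add(name.equals(oldName) ? newName : name);
            }
            return Collections.unmodifiableList(next);
        });
    }
}
```

If a caller takes `before = registry.snapshot()` and then `registry.rename("Jane Doe", "Jane Smith")` runs, `before` still contains "Jane Doe" while a fresh `snapshot()` contains "Jane Smith". Neither reader ever sees a half-updated list.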

MATH WARNING

I’m going to show that, if we have two good states, it’s better to be in one or the other, but we cannot ever be in both. Let’s begin by defining our function f(x,y). We also define that our two states (without the tick mark and with the tick mark) are not equal:

\[ x \neq x', \qquad y \neq y' \]

Let’s create a set of our known two good states:

\[ \{\, f(x, y),\; f(x', y') \,\} \]

So, this means that mixing the sets of parameters still works and still gives us a value; however, these are not values that exist in our set of good states.

\[ f(x', y) \notin \{\, f(x, y),\; f(x', y') \,\}, \qquad f(x, y') \notin \{\, f(x, y),\; f(x', y') \,\} \]

So, we’re going to think about variables as placeholders within a specific scope. If we think back to our email issue, then, we know that we can operate only in a known good state on both the list and the Customer and Contact records themselves.

Let’s begin working on our fix by doing the simplest thing and making our Customer.allCustomers an immutable list. Remember, we’re not making the variable immutable, we’re making the thing the variable contains immutable. Let’s see this in Example 4-7.

Example 4-7. Mutable allCustomers list that will contain immutable Customer objects

    public static List<Customer> allCustomers = new ArrayList<Customer>();

That was simple enough, but now we have to deal with our eachEnabledContact, right? Actually, we don’t have to do anything, because it was read-only functionality.
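One caveat worth sketching: a plain ArrayList can still be changed by anyone who holds it, so the immutability here rests on convention. If you want the container itself to reject changes, one option in Java (my assumption about how you might enforce it, not something the chapter’s code does) is to wrap a private copy with Collections.unmodifiableList. ReadOnlyLists and its methods are hypothetical helper names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ReadOnlyLists {
    // Copies the input first so later changes to the caller's list cannot leak in,
    // then wraps the copy so add/remove/set throw UnsupportedOperationException.
    static <T> List<T> freeze(List<T> source) {
        return Collections.unmodifiableList(new ArrayList<>(source));
    }

    // Returns true if the list refuses an attempt to add an element.
    static <T> boolean rejectsAdd(List<T> list, T element) {
        try {
            list.add(element);
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }
}
```

Note that this protects only the list structure. The elements inside must be immutable in their own right, which is what the next step addresses.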

Let’s continue our momentum and make all fields of the Customer object immutable. Again, this is fairly straightforward, as we make all fields final with one caveat: we must have a constructor that sets every field, as shown in Example 4-8.

Example 4-8. Immutable Customer object

    // Final fields carry no initializers; each one is assigned exactly once,
    // in the constructor.
    public final Integer customer_id;
    public final String name;
    public final String state;
    public final String domain;
    public final Boolean enabled;
    public final Contract contract;
    public final List<Contact> contacts;

    public Customer(Integer customer_id,
                    String name,
                    String state,
                    String domain,
                    Boolean enabled,
                    Contract contract,
                    List<Contact> contacts) {
        this.customer_id = customer_id;
        this.name = name;
        this.state = state;
        this.domain = domain;
        this.enabled = enabled;
        this.contract = contract;
        this.contacts = contacts;
    }

REMOVING SETTERS

Because we’re changing our fields to immutable, we must remove all setters. If you think about it, having setters for immutable fields is a fallacy in and of itself, because the fields can be set only when the object is created.
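So what replaces a setter? A common pattern, shown here as a sketch rather than as the book’s own code, is a method that returns a new object with the changed values. ImmutableContact and withLastNameAndEmail are hypothetical names; the fields mirror Contact.java, and the example is plain Java.

```java
// Hypothetical immutable variant of the chapter's Contact class.
final class ImmutableContact {
    final Integer contact_id;
    final String firstName;
    final String lastName;
    final String email;
    final Boolean enabled;

    ImmutableContact(Integer contact_id, String firstName, String lastName,
                     String email, Boolean enabled) {
        this.contact_id = contact_id;
        this.firstName = firstName;
        this.lastName = lastName;
        this.email = email;
        this.enabled = enabled;
    }

    // Instead of setLastName/setEmail, build a new object with both values
    // changed at once; the original is left exactly as it was.
    ImmutableContact withLastNameAndEmail(String newLastName, String newEmail) {
        return new ImmutableContact(contact_id, firstName, newLastName, newEmail, enabled);
    }
}
```

Because the last name and email change together in a single new object, there is no moment at which a reader can observe the new name paired with the old address.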

Next, let’s update our Contract class and make it immutable as well (Example 4-9). It is important to understand that as we do this, we will be unable to run and test the functionality until we’ve completed this refactor. Remember, our original code for updating a contract sets the field, which does not work with immutable variables.

Example 4-9. Immutable Contract class

    import java.util.List;
    import java.util.Calendar;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.LinkedBlockingQueue;

    public class Contract {

        // Final fields are assigned exactly once, in the constructor.
        public final Calendar begin_date;
        public final Calendar end_date;
        public final Boolean enabled;

        public Contract(Calendar begin_date, Boolean enabled) {
            this.begin_date = begin_date;
            // The contract ends two years after it begins.
            this.end_date = Calendar.getInstance();
            this.end_date.setTimeInMillis(this.begin_date.getTimeInMillis());
            this.end_date.add(Calendar.YEAR, 2);
            this.enabled = enabled;
        }
    }

Even though we know we need to update setContractForCustomerList, we’re going to set that aside for a moment. First, we’ll create a new constructor, as shown in Example 4-10, so that we can create a new object with all members set.

Example 4-10. Constructor for the Contract class

    public Contract(Calendar begin_date, Calendar end_date, Boolean enabled) {
        this.begin_date = begin_date;
        this.end_date = end_date;
        this.enabled = enabled;
    }

Now, let’s go ahead and update our setContractForCustomerList method so that we can get things working again. We’ll want to map over our allCustomers list, updating customers that have specific ids. All of this is shown in Example 4-11.

Example 4-11. setContractForCustomerList with map

    public static List<Customer> setContractForCustomerList(List<Integer> ids,
                                                            Boolean status) {
        Customer.allCustomers.collect { customer ->
            if(ids.indexOf(customer.customer_id) >= 0) {
                new Customer(
                    customer.customer_id,
                    customer.name,
                    customer.state,
                    customer.domain,
                    customer.enabled,
                    new Contract(
                        customer.contract.begin_date,
                        customer.contract.end_date,
                        status
                    ),
                    customer.contacts
                )
            } else {
                customer
            }
        }
    }

Some might think that this looks terrible, but it is a fantastic piece of code. We iterate over the list of objects, then check to see if the current customer_id is in our list of ids. If it is, we create a new customer, copying all the fields over except Contract. Instead, we create a new Contract with the specific status that was passed to us. This new customer is then used in place of the original customer record. If the customer_id is not in our list, we return the original customer.

Let’s try to refactor this so that if we want to, we can change the Contract in any manner. We’ll add a method to Customer.java called updateContractForCustomerList, which will do the same thing as Example 4-11, except now we execute a higher-order function on the contract itself. We will then expect that a contract will be returned. Let’s look at the code in Example 4-12.

Example 4-12. updateContractForCustomerList function

    public static List<Customer> updateContractForCustomerList(List<Integer> ids,
                                                               Closure cls) {
        Customer.allCustomers.collect { customer ->
            if(ids.indexOf(customer.customer_id) >= 0) {
                new Customer(
                    customer.customer_id,
                    customer.name,
                    customer.state,
                    customer.domain,
                    customer.enabled,
                    cls(customer.contract),
                    customer.contacts
                )
            } else {
                customer
            }
        }
    }

Now, we update our original setContractForCustomerList function in Contract.java to call into Customer.updateContractForCustomerList, as shown in Example 4-13. We are returning a List of Customers, so we are able to execute Customer.allCustomers = Contract.setContractForCustomerList(…), which provides us with a constant, pristine list.

Example 4-13. setContractForCustomerList function, which references updateContractForCustomerList

    public static List<Customer> setContractForCustomerList(List<Integer> ids,
                                                            Boolean status) {
        Customer.updateContractForCustomerList(ids, { contract ->
            new Contract(contract.begin_date, contract.end_date, status)
        })
    }

Remember how I mentioned an update contact method earlier? This was the entire reason for our bug; let’s go ahead and update that method so that we can fix the broken code, which is still trying to update objects.

In Example 4-14, we’ll see our new updateContact method, which will map or collect all the Customer records.

Example 4-14. updateContact using an immutable list

    public static List<Customer> updateContact(Integer customer_id,
                                               Integer contact_id,
                                               Closure cls) {
        Customer.allCustomers.collect { customer ->
            if(customer.customer_id == customer_id) {
                new Customer(
                    customer.customer_id,
                    customer.name,
                    customer.state,
                    customer.domain,
                    customer.enabled,
                    customer.contract,
                    customer.contacts.collect { contact ->
                        if(contact.contact_id == contact_id) {
                            cls(contact)
                        } else {
                            contact
                        }
                    }
                )
            } else {
                customer
            }
        }
    }

But wait: we’re starting to repeat ourselves, so let’s remember DRY and see what we can abstract. Take a few minutes to work on it yourself, and then check Example 4-15 to see what I did.

Example 4-15. Refactoring to abstract the looping methodology

    public static List<Customer> updateCustomerByIdList(List<Integer> ids,
                                                        Closure cls) {
        Customer.allCustomers.collect { customer ->
            if(ids.indexOf(customer.customer_id) >= 0) {
                cls(customer)
            } else {
                customer
            }
        }
    }

    public static List<Customer> updateContact(Integer customer_id,
                                               Integer contact_id,
                                               Closure cls) {
        updateCustomerByIdList([customer_id], { customer ->
            new Customer(
                customer.customer_id,
                customer.name,
                customer.state,
                customer.domain,
                customer.enabled,
                customer.contract,
                customer.contacts.collect { contact ->
                    if(contact.contact_id == contact_id) {
                        cls(contact)
                    } else {
                        contact
                    }
                }
            )
        })
    }

    public static List<Customer> updateContractForCustomerList(List<Integer> ids,
                                                               Closure cls) {
        updateCustomerByIdList(ids, { customer ->
            new Customer(
                customer.customer_id,
                customer.name,
                customer.state,
                customer.domain,
                customer.enabled,
                cls(customer.contract),
                customer.contacts
            )
        })
    }
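If you look closely, the customer loop and the inner contact loop still share one shape: walk a list, replace the elements that match, and keep the rest. As a sketch of how far that abstraction could go, here is a generic version in plain Java. ListUpdates and replaceWhere are hypothetical names, not part of the chapter’s code, and returning an unmodifiable list is my own choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

final class ListUpdates {
    // Returns a new read-only list in which every element matching the predicate
    // is replaced by update.apply(element); the input list is left untouched.
    static <T> List<T> replaceWhere(List<T> items,
                                    Predicate<T> matches,
                                    UnaryOperator<T> update) {
        List<T> result = new ArrayList<>(items.size());
        for (T item : items) {
            result.add(matches.test(item) ? update.apply(item) : item);
        }
        return Collections.unmodifiableList(result);
    }
}
```

For example, `ListUpdates.replaceWhere(Arrays.asList(1, 2, 3), n -> n == 2, n -> n * 10)` yields a list equal to `[1, 20, 3]`, and the original list is unchanged. With customers, the predicate would test customer_id against the ids and the update function would build the new Customer.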

Conclusion

Most people believe that moving to immutable variables will increase the complexity of their code; however, it actually helps in many different ways. Tracking down bugs—because we know certain variables cannot change—becomes easier; we can better understand what might have been passed into and out of functions.

Immutability is a difficult technique to implement because you will most likely need to do large refactorings in order to accomplish it. Just look back at our conversion of the Customer object; we actually had to make changes to other classes and methods to support this. The key to implementing immutability is to start on your new classes and work backward during downtime to refactor your old code. Start with smaller classes that don’t change much and then move on to your harder classes.