Pragmatic Unit Testing in Java 8 with JUnit (2015)

Part 3. The Bigger Design Picture

Man and Woman cannot live by bread alone. Or by unit testing alone. Unit testing is just another part of a bigger picture that you can simply refer to as “design.” You want to ensure that your code’s design stays clean as you build your system, so you’ll learn about how the practice of refactoring is afforded by good unit tests. To refactor effectively, you in turn need to understand what a good, bigger design looks like. You’ll also discover that some things are hard to test, so you’ll read about mock objects and how they help isolate your tests from inevitable difficult dependencies. Finally, you want to ensure that your tests continue to return on value, so we’ll step through taking a difficult test and whittling it into one that costs less to maintain.

Chapter 8. Refactoring to Cleaner Code

Our systems are bloated! You can pick almost any system at random and spot obvious bits of rampant duplication—whether it’s a hundred-line-long method that’s almost a complete replication from another class or a few lines of utility code repeated megaumpteen times throughout. The cost of such duplication is significant: every piece of code duplicated increases the cost to maintain it, as well as the risk in making a change. You want to minimize the amount of duplication in your system’s code.

The cost of understanding code is also significant. A change requiring ten minutes of effort in clear, well-structured code can require hours of effort in convoluted, muddy code. You want to maximize the clarity in your system’s code.

You can accomplish both goals—low duplication and high clarity—at a reasonable cost and with a wonderful return on investment. The good news is that having unit tests can help you reach the goals. In this chapter you’ll learn how to refactor your code with these ideals in mind.

A Little Bit o’ Refactor

If you’ve recently arrived from Proxima Centauri in a slow warp drive that required fifteen years of travel time, you might not have heard the term refactoring. Otherwise, you at least recognize it from the menus in your IDE. You might even be aware that refactoring your code means you’re transforming its underlying structure while retaining its existing functional behavior.

In other words, refactoring is moving code bits around and making sure the system still works. Willy-nilly restructuring of code sounds risky! By gosh, you really want to make sure you have appropriate protection when doing so. You know…tests.

An Opportunity for Refactoring

Let’s revisit the iloveyouboss code. You wrote a couple of tests with us for it back in Chapter 2, Getting Real with JUnit. As a reminder, here’s the core matches() method from the Profile class:

iloveyouboss/16/src/iloveyouboss/Profile.java
	public boolean matches(Criteria criteria) {
	score = 0;

	boolean kill = false;
	boolean anyMatches = false;
	for (Criterion criterion: criteria) {
	Answer answer = answers.get(
	criterion.getAnswer().getQuestionText());
	boolean match =
	criterion.getWeight() == Weight.DontCare \|\|
	answer.match(criterion.getAnswer());
	if (!match && criterion.getWeight() == Weight.MustMatch) {
	kill = true;
	}
	if (match) {
	score += criterion.getWeight().getValue();
	}
	anyMatches \|= match;
	}
	if (kill)
	return false;
	return anyMatches;
	}

The method isn’t particularly long, weighing in at around a dozen total lines of expressions and/or statements. Yet it’s reasonably dense, embodying quite a bit of logic. We were able to add five more test cases behind the scenes.

Extract Method: Your Second-Best Refactoring Friend

(Okay, we’ll kill the mystery before you go digging in the index.… Your best refactoring friend is rename, whether it be a class, method, or variable of any sort. Clarity is largely about declaration of intent, and good names are what impart clarity best in code.)

Our goal: reduce complexity in the matches() method so that we can readily understand what it’s responsible for—its policy. We do that in part by extracting detailed bits of logic to new, separate methods.

Conditional expressions often read poorly, particularly when they are complex. An example is the assignment to match that appears in the for loop in matches():

iloveyouboss/16/src/iloveyouboss/Profile.java
	for (Criterion criterion: criteria) {
	Answer answer = answers.get(
	criterion.getAnswer().getQuestionText());
*	boolean match =
*	criterion.getWeight() == Weight.DontCare \|\|
*	answer.match(criterion.getAnswer());
	// ...
	}

Isolate the complexity of the assignment by extracting it to a separate method. You’re left with a simple declaration in the loop: the match variable represents whether or not the criterion matches the answer:

iloveyouboss/17/src/iloveyouboss/Profile.java
	public boolean matches(Criteria criteria) {
	score = 0;

	boolean kill = false;
	boolean anyMatches = false;
	for (Criterion criterion: criteria) {
	Answer answer = answers.get(
	criterion.getAnswer().getQuestionText());
*	boolean match = matches(criterion, answer);

	if (!match && criterion.getWeight() == Weight.MustMatch) {
	kill = true;
	}
	if (match) {
	score += criterion.getWeight().getValue();
	}
	anyMatches \|= match;
	}
	if (kill)
	return false;
	return anyMatches;
	}

*	private boolean matches(Criterion criterion, Answer answer) {
*	return criterion.getWeight() == Weight.DontCare \|\|
*	answer.match(criterion.getAnswer());
*	}

If you need to know the details of how a criterion matches an answer, you can navigate into the newly extracted matches() method. Extracting lower-level details removes distracting clutter if you need only understand the high-level policy for how a Profile matches against a Criteria object.

It’s way too easy to break functionality when moving code about. You need the confidence to know that you can change code and not introduce sneaky little defects that aren’t discovered until much later.

Fortunately, the tests written for Profile (see Chapter 2, Getting Real with JUnit) begin to provide you with the confidence you need. With each small change, you run your fast set of tests—it’s cheap, easy, and fun.

The ability to move code about safely is one of the most important benefits of unit testing. It allows you to add new features safely, and it also allows you to make changes that keep the design in good shape. In the absence of sufficient tests, you’ll tend to make fewer changes. Or you’ll make changes that are highly risky.

Finding Better Homes for Our Methods

Our loop is a bit easier to read—great! But we note that the newly extracted code in matches() doesn’t have anything to do with the Profile object itself. It seems that either the Answer class or the Criterion class could be responsible for determining when one matches another.

Move the newly extracted matches() method to the Criterion class. Criterion objects already know about Answer objects, but the converse is not true—Answer is not dependent on Criterion. If you were to move matches() to Answer, you’d have a bidirectional dependency. Not cool.

Here’s matches() in its new home:

iloveyouboss/18/src/iloveyouboss/Criterion.java
	public class Criterion implements Scoreable {
	// ...
	public boolean matches(Answer answer) {
	return getWeight() == Weight.DontCare \|\|
	answer.match(getAnswer());
	}
	}

And here’s what the loop looks like after the move:

iloveyouboss/18/src/iloveyouboss/Profile.java
	for (Criterion criterion: criteria) {
	Answer answer = answers.get(
	criterion.getAnswer().getQuestionText());
*	boolean match = criterion.matches(answer);

	if (!match && criterion.getWeight() == Weight.MustMatch) {
	kill = true;
	}
	if (match) {
	score += criterion.getWeight().getValue();
	}
	anyMatches \|= match;
	}

The statement that assigns into the answer local variable is quite a mouthful:

iloveyouboss/18/src/iloveyouboss/Profile.java
	Answer answer = answers.get(
	criterion.getAnswer().getQuestionText());

It suffers for violating the Law of Demeter (which roughly says to avoid chaining together method calls that ripple through other objects), and it’s simply not clear.

A first step toward improving things is to extract the right-hand-side expression of the answer assignment to a new method whose name, answerMatching(), better explains what’s going on:

iloveyouboss/19/src/iloveyouboss/Profile.java
	public boolean matches(Criteria criteria) {
	score = 0;

	boolean kill = false;
	boolean anyMatches = false;
	for (Criterion criterion: criteria) {
*	Answer answer = answerMatching(criterion);
	boolean match = criterion.matches(answer);

	if (!match && criterion.getWeight() == Weight.MustMatch) {
	kill = true;
	}
	if (match) {
	score += criterion.getWeight().getValue();
	}
	anyMatches \|= match;
	}
	if (kill)
	return false;
	return anyMatches;
	}

*	private Answer answerMatching(Criterion criterion) {
*	return answers.get(criterion.getAnswer().getQuestionText());
*	}

Temporary variables have a number of uses. You might be more accustomed to temporaries that cache the value of an expensive computation or collect things that change throughout the body of a method. The answer temporary variable does neither, but another use of a temporary variable is to clarify the intent of code—a valid choice even if the temporary is used only once.

Automated and Manual Refactorings

In our case, the answer local variable doesn’t clarify the code, and it’s used only once. Inline (remove) the variable by replacing its use with the answerMatching(criterion) expression:

iloveyouboss/20/src/iloveyouboss/Profile.java
	for (Criterion criterion: criteria) {
*	boolean match = criterion.matches(answerMatching(criterion));

	if (!match && criterion.getWeight() == Weight.MustMatch) {
	kill = true;
	}
	if (match) {
	score += criterion.getWeight().getValue();
	}
	anyMatches \|= match;
	}

You could manually inline answer, but your IDE most likely automates the inline refactoring. In Eclipse, select Refactor ▶ Inline… from the main menu to inline.

The very existence of automated IDE automated should reinforce the idea that refactorings are code transformations that don’t affect functional behavior. Peruse the refactoring menu in your IDE of choice. Any good IDE automates well over a dozen common transformations. Learn and use them—you’ll save countless hours over coding the transforms yourself, and even more hours over fixing the mistakes you’ll make refactoring manually.

Lucky you: fifteen years ago, Java programmers manually moved bits of code about in highly unsafe ways. Today, the beauty of automated refactoring can’t be overstated. You get to watch the computer do the dirty work and know that your code still works.

With some of the detail out of the way in the matches() method, we now have an easier time understanding its high-level policy. We can piece apart the core goals of the method:

· It calculates the total score by summing the weights of matching criteria.

· It returns false when any must-match criterion does not match the corresponding profile answer.

· It returns true if there are otherwise any matches, false if there are not.

Let’s restructure matches() to more clearly state these three core intents. Last things first. Change the return statement from returning the value of anyMatches to instead return the result of a Boolean method, anyMatches(). Find the four lines of code in matches() that determine the result of whether or not there are any matches, and move them into the anyMatches() method:

iloveyouboss/20-misadventure/src/iloveyouboss/Profile.java
	public boolean matches(Criteria criteria) {
	score = 0;

	boolean kill = false;
	for (Criterion criterion: criteria) {
	boolean match = criterion.matches(answerMatching(criterion));

	if (!match && criterion.getWeight() == Weight.MustMatch) {
	kill = true;
	}
	if (match) {
	score += criterion.getWeight().getValue();
	}
	}
	if (kill)
	return false;
*	return anyMatches(criteria);
	}

*	private boolean anyMatches(Criteria criteria) {
*	boolean anyMatches = false;
*	for (Criterion criterion: criteria)
*	anyMatches = criterion.matches(answerMatching(criterion));
*	return anyMatches;
*	}

Every refactoring requires you to rerun the tests. This refactoring is riskier because there’s no automated way to gather disjoint lines of code into a new method, so you must do things manually. Indeed, we have a failing test:

iloveyouboss/20-misadventure/test/iloveyouboss/ProfileTest.java
	@Test
	public void matchAnswersTrueWhenAnyOfMultipleCriteriaMatch() {
	profile.add(answerThereIsRelocation);
	profile.add(answerDoesNotReimburseTuition);
	criteria.add(new Criterion(answerThereIsRelocation, Weight.Important));
	criteria.add(new Criterion(answerReimbursesTuition, Weight.Important));

	boolean matches = profile.matches(criteria);

	assertTrue(matches);
	}

The fix is to use the compound assignment operator (|=) when updating the value of anyMatches (apparently the | character slipped through the cracks when we manually constructed the assignment statement):

iloveyouboss/21/src/iloveyouboss/Profile.java
	private boolean anyMatches(Criteria criteria) {
	boolean anyMatches = false;
	for (Criterion criterion: criteria)
*	anyMatches \|= criterion.matches(answerMatching(criterion));
	return anyMatches;
	}

Oops, in any case. Simple mistakes are easy to make when you change any code manually. For this reason, always prefer using your IDE’s automated refactoring tools if you can. Also, be happy you have tests, and honor them by running them all the time when refactoring.

In any case, it’s possible that you’re mildly concerned about that method extraction and its performance implications. Hang in there.

Taking Refactoring Too Far?

Similarly, extract the code that calculates the total weighting of all matches:

iloveyouboss/22/src/iloveyouboss/Profile.java
	public boolean matches(Criteria criteria) {
*	calculateScore(criteria);

	boolean kill = false;
	for (Criterion criterion: criteria) {
	boolean match = criterion.matches(answerMatching(criterion));
	if (!match && criterion.getWeight() == Weight.MustMatch) {
	kill = true;
	}
	}
	if (kill)
	return false;
	return anyMatches(criteria);
	}

*	private void calculateScore(Criteria criteria) {
*	score = 0;
*	for (Criterion criterion: criteria)
*	if (criterion.matches(answerMatching(criterion)))
*	score += criterion.getWeight().getValue();
*	}

Double hmm. You might be wondering if we’re headed toward trouble.

Finally, extract the logic that determines whether or not there are any must-meet criteria that aren’t a match:

iloveyouboss/23/src/iloveyouboss/Profile.java
	public boolean matches(Criteria criteria) {
	calculateScore(criteria);
*	if (doesNotMeetAnyMustMatchCriterion(criteria))
*	return false;
	return anyMatches(criteria);
	}

*	private boolean doesNotMeetAnyMustMatchCriterion(Criteria criteria) {
*	for (Criterion criterion: criteria) {
*	boolean match = criterion.matches(answerMatching(criterion));
*	if (!match && criterion.getWeight() == Weight.MustMatch)
*	return true;
*	}
*	return false;
*	}

Harumph. Three new methods, three new loops. Are we kidding? We’re not. Let’s discuss the performance implications, but first let’s see what benefits we gain by having three methods.

The Reward: Clear, Testable Units

The matches() method now clearly explains the overall algorithm in a form that you can almost instantly digest. You can almost read the code as-is to step through the algorithm:

· Calculate the score given the criteria.

· Return false if the profile does not meet any must-match criterion from the criteria.

· Otherwise, return the result of whether or not there are any matches given the criteria.

The prior version of the code required more careful reading and created more opportunities for confusion about the intent of matches().

The implementation details for each of the three steps in the algorithm are hidden in the corresponding helper methods calculateScore(), doesNotMeetAnyMustMatchCriterion(), and anyMatches(). Each helper method allows the necessary behavior to be expressed in a concise, isolated fashion, not cluttered with other concerns.

The Performance Anxiety: Oh No You Di-n’t!

Some of you dear readers are no doubt perturbed. After refactoring of the matches() method, each of anyMatches(), calculateScore(), and doesNotMeetAnyMustMatchCriterion() iterates through the criterion collection. Three new loops—we have potentially quadrupled the time to execute thematches() method.

To which we say, “So what?”

If you can respond to that obnoxious question with an answer relevant to our real requirements, we’ll listen. Otherwise, stop worrying about it. Yes, performance is important. But is the refactored code now incapable of meeting performance expectations?

Stop—you can’t answer that question. We (or our customer) might be able to (since we’re making up requirements as we go). Maybe we expect modest volume and don’t care about the possible performance degradation. Yet. Or maybe the code doesn’t perform as badly as you might guess. It’s also possible that we need to process a few million profiles, and performance is of the utmost consideration.

If performance isn’t an immediate problem, invest in keeping the code clean instead of wasting time with premature optimization efforts. Optimized code is more challenging in so many ways: it usually makes the code more difficult, increasing maintenance costs, and it usually makes the design less flexible.

In contrast, a clean design is the best protection against the sudden need to optimize for performance. A clean design often provides more flexibility to move code around and try different algorithms.

images/aside-icons/tip.png

A clean design is your best preparation for optimization.

If we think performance is a problem right now: before we do anything else, we need to measure how bad things are with a performance test (see Right-BICE[P]: Performance Characteristics). We can then write a little test code that tells us how fast the old code was and compare the performance to the refactored code to determine the percentage degradation.

Right now, the code in matches() clearly states what’s going on. But it also poses some concerns about the bigger design picture—for example, does the Profile class now do too much? Next chapter, we’ll explore where our design falls flat, and we’ll use our tests again to get things back on track.

After

It’s easy to write a lot of code quickly. It’s just as easy to let that code get dirty, to the point where it becomes difficult to step through. Unit tests provide the safeguards you need to clean up code messes without breaking things. In this chapter you learned techniques for keeping your system clean continually, which will enable you to stem much of the rot inevitable in your system.

As you begin to sweep away the small bits of dust in your system, you’ll start to see larger design concerns. Next up, you’ll learn how to lean on unit tests again to address these larger design concerns.