An Introduction to Test- and Behavior-Driven Development - Test-Driven Infrastructure with Chef (2011)

Chapter 5. An Introduction to Test- and Behavior-Driven Development

The Principles of TDD and BDD

In Chapter 1, I argued that, to mitigate the risks of adopting the infrastructure as code paradigm, systems should be in place to ensure that our code produces the environment needed, and to ensure that our changes have not caused side effects that alter other aspects of the infrastructure.

What we’re describing here is automated testing. In his book Managing Software Debt: Building for Inevitable Change (Addison-Wesley), Chris Sterling uses the phrase “a supportable structure for imminent change” to describe what I am calling for. Particularly as infrastructure developers, we have to expect our systems to be in a state of flux. We may need to add components to our systems, refine the architecture, tweak the configuration, or resolve issues with its current implementation. When making these changes using Chef, we’re effectively doing exactly what a traditional software developer does in response to a bug or feature request. As complexity and size grow, it becomes increasingly important to have safe ways to support change. The approach I’m recommending has its roots firmly in the historic evolution of best practices in the software development world.

A Very Brief History of Agile Software Development

By the end of the 1990s, the software industry did not enjoy a particularly good reputation—across four critical areas, customers were feeling let down. Firstly, the perception (and expectation, and experience) was often that software would be delivered late and over budget. Secondly, despite a lengthy cycle of requirement gathering, analysis, design, implementation, testing, and deployment, it was not uncommon for the customer to discover that this late, expensive software didn’t really do what was needed. Whether this was due to a failure in initial requirement-gathering or a shift in needs over the lifecycle of the software’s development wasn’t really the point—the software didn’t fully meet the customer’s requirements. Thirdly, a frequent complaint was that, once live and a part of the critical business processes, the software itself was unstable or slow. Software that fails under load or crashes every few hours is of negligible value, regardless of whether it has been delivered on budget, on time, and meeting the functional requirements. Finally, ongoing maintenance of the software was very costly. An analysis of this led to a recognition that the later in the software lifecycle that problems were identified or new requirements emerged, the more expensive they were to service.

In 2001, a small group of professionals got together to try to tackle some tough questions about why the software industry was so frequently characterized by failed projects and an inability to deliver quality code, on time and in budget. Together they gathered a set of ideas that began to revolutionize the software development industry. Thus began the Agile movement. Its history and implementations are outside the scope of this book, but the key point is that more than a decade ago, professional developers started to put into practice approaches to tackle the seemingly inherent problems of the business of writing software.

Now, I’m not suggesting that the state of infrastructure code today is as bad as the software industry in the late 90s. However, if we’re to deliver infrastructure code that is of high quality, easy to maintain, reliable, and delivers business value, I think it stands to reason that we must take care to learn from those who have already put mechanisms in place to help solve some of the problems we’re facing today.

Test-Driven Development

Out of the Agile movement emerged a number of core practices that were felt to be important to guarantee not only quality software but also an enjoyable working experience for developers. Ron Jeffries summarizes these excellently in his article introducing Extreme Programming, one of a family of Agile approaches that emerged in the early 2000s. Some of these practices can be introduced as good habits, and don’t require much technology to support their implementation. Of this family, the practice most crucial for creating a supportable structure for imminent change, providing insurance and warning against unwanted side effects, is that of test-driven development (TDD). For infrastructure developers, the practice is both the most difficult to introduce and implement, and also the one that promises the biggest return on investment.

TDD is a widely adopted way of working that facilitates the creation of highly reliable and maintainable code. The philosophy of TDD is encapsulated in the phrase Red, Green, Refactor. This is an iterative approach that follows these six steps:

1. Write a test based on requirements.

2. Run the test and watch it fail.

3. Write the simplest code you can to make the test pass.

4. Run the test and watch it pass.

5. Improve the code as required to make it perform well, be readable, and reusable, but without changing its behavior.

6. Repeat the cycle.

Kent Beck and Cynthia Andres, in Extreme Programming Explained (Addison-Wesley), suggest this way of working brings benefits in four clear areas:

1. It helps prevent scope from growing. We write code only to make a failing test pass.

2. It reveals design problems. If the process of writing the test is laborious, that’s a sign of a design issue; loosely coupled, highly cohesive code is easy to test.

3. It builds trust. The ongoing, iterative process of demonstrating clean, well-written code, with intent indicated by a suite of targeted, automated tests, builds trust with team members, managers, and stakeholders.

4. It helps programmers get into a rhythm. Test, code, refactor—a rhythm that is at once productive, sustainable, and enjoyable.

Behavior-Driven Development

However, in 2007, a group of Agile practitioners, including Dan North and Dave Astels, started rocking the boat with presentations and tool development work. Their key observation seemed to be that it’s perfectly possible to write high quality, well-tested, reliable, and maintainable code, and miss the point altogether. As software developers, we are employed not to write code, but to help our customers to solve problems. In practice, the problems we solve pretty much always fit into one of three categories:

1. Help the customer make more money.

2. Help the customer spend less money.

3. Help the customer protect the money they already have.

Around this recognition grew up an evolution of TDD focused specifically on helping developers write code that matters. Just as TDD proved to be a hugely effective tool in enhancing the technical quality of software, behavior-driven development (BDD) set out to enhance the success with which software fulfilled the business’s need.

The shift from TDD to BDD is subtle but significant. Instead of thinking in terms of verification of a unit of code, we think in terms of a specification of how that code should behave—what it should do. Our task is to write a specification of system behavior that is precise enough for it to be executed as code.

Importantly, BDD is about conversations. The whole point of BDD is to ensure that the real business objectives of stakeholders get met by the software we deliver. If stakeholders aren’t involved, if discussions aren’t taking place, BDD isn’t happening. BDD yields benefits across many important areas.

Building the right thing

BDD helps to ensure that the right features are built and delivered the first time. By remembering the three categories of problems that we’re typically trying to solve, and by beginning with the stakeholders—the people who are actually going to be using the software we write—we are able to clearly specify what the most important features are, and arrive at a definition of done that encapsulates the business driver for the software.

Reducing risk

BDD also reduces risk—the risk that, as developers, we’ll go off at a tangent. If our focus is on making a test pass, and that test encapsulates the customer requirement in terms of the behavior of the end result, the likelihood that we’ll get distracted or write something unnecessary is greatly reduced. Interestingly, a suite of acceptance tests developed this way, in partnership with the stakeholder, also forms an excellent starting point for monitoring the system throughout its lifecycle. We know how the system should behave, and if we can automate tests that prove the system is working according to specification, and put alerts around them (both in the development process so we capture defects, and when live so we can resolve and respond to service degradation), we have grounded our monitoring in the behavior of the application that the stakeholder has defined as being of paramount importance to the business.

Evolving design

It also helps us to think about the design of the system. The benefits of writing unit tests to increase confidence in our code are pretty obvious. Maturing to the point that we write these tests first helps us focus on writing only the code that is explicitly needed. The tests also serve as a map to the code and offer lightweight documentation. By tweaking our approach towards thinking about specifying behavior rather than testing classes and methods, we come to appreciate test-driven development as a practice that helps us discover how the system should work, and molds our thinking towards elegant solutions that meet the requirements.

How does all of this relate to infrastructure as code? Well, as infrastructure developers, we are providing the underlying systems that make it possible to deliver software effectively. This means our customers are often application developers or test and QA teams. Of course, our customers are also the end users of the software that runs on our systems, so we’re responsible for ensuring our infrastructure performs well and remains available when needed. Having accepted that we need some kind of mechanism for testing our infrastructure to ensure it evolves rapidly without unwanted side effects, bringing the principle of BDD into the equation helps us to ensure that we’re delivering business value by providing the infrastructure that is actually needed. We can avoid wasting time pursuing the latest and greatest technology by realizing we could meet the requirements of the business more readily with a simpler and established solution.

TDD and BDD with Ruby

Ruby has always been a language in which testing, and particularly testing up-front, has been popular. Also, the development community around Ruby has historically been particularly positive about Agile software development in general, and as such has spawned a great many creative and powerful testing tools and frameworks. I think it’s fair to say that as a language and environment in which to work, Ruby is probably the best served for libraries, tools, and frameworks. Within this ecosystem, I’m going to discuss three tools that, when used together, provide a full coverage of testing capabilities, from the lowest to the highest level—Cucumber, RSpec, and Minitest.

As this is a book about test-driven infrastructure development, I’m going to make sure we’ve got a reasonable understanding of testing in general and test-first development, before we go on to discuss writing infrastructure code using Chef.

For the purposes of the exercise, we’re going to write a Ruby class that assesses whether a team member is a hipster. (I’m guessing everyone knows what a hipster is by now, but there’s always Google if you don’t!)

Minitest: Unit Testing for the 21st Century

A unit test is pretty much the simplest and lowest level kind of test we can write. It is designed to verify whether a precise, small, tightly defined piece of functionality behaves as it should. Typically, a unit test will exercise a single method. The seminal unit testing framework was JUnit. Conceived by Kent Beck and Erich Gamma, it built on SUnit, written by Kent Beck for Smalltalk. JUnit quickly became the standard approach to unit testing, to the extent that the term xUnit began to appear to describe a test framework in any language that broadly implemented the same approach to unit testing. Ruby’s xUnit implementation was Test::Unit.

The pattern is pretty much always the same. You create a class as a subclass of Test::Unit::TestCase, write methods beginning with the word test, set up some state to exercise a method, and make an assertion about what the method should do.

We’ll set the background with a little history lesson, and look at the original and most basic unit testing capabilities of Ruby—the faithful old workhorse Test::Unit.

First, ensure that the test-unit gem is installed:

$ gem install test-unit

Create a directory for the project and a file for a test:

$ mkdir tdd-principles
$ cd tdd-principles
$ touch test_hipster.rb

Now, let’s write a very simple test using the traditional test/unit approach:

require "test/unit"

class HipsterTest < Test::Unit::TestCase

  def setup
    @developer = HipsterAssessor.new(gears_on_bike=1)
  end

  def test_has_fixie?
    assert_equal true, @developer.has_fixie?
  end

end

We’re setting up the test by creating an instance of the HipsterAssessor, and passing in that the developer we are assessing has a single gear on their bicycle. We’re going to test the has_fixie? method, and we’re setting up the expectation that the method will return true.

Let’s run the test:

$ ruby test_hipster.rb
Loaded suite test_hipster
Started
E
Finished in 0.000294 seconds.

  1) Error:
test_has_fixie?(HipsterTest):
NameError: uninitialized constant HipsterTest::HipsterAssessor
    test_hipster.rb:6:in `setup'

1 tests, 0 assertions, 0 failures, 1 errors

This is the standard approach for test-first programming. We’ve written the test. The test has failed. Now we make it pass. In this case the test wasn’t able to run yet—it errored out because we’re trying to instantiate a HipsterAssessor, but we’ve not written the code for that yet, nor is it available to the test. Let’s fix that by creating a new file called hipsterassessor.rb, which contains the following:

class HipsterAssessor
end

And let’s require that file in our test by adding:

require './hipsterassessor'

Let’s run the test again:

$ ruby test_hipster.rb
Loaded suite test_hipster
Started
E
Finished in 0.000345 seconds.

  1) Error:
test_has_fixie?(HipsterTest):
ArgumentError: wrong number of arguments (1 for 0)
    test_hipster.rb:7:in `initialize'
    test_hipster.rb:7:in `new'
    test_hipster.rb:7:in `setup'

1 tests, 0 assertions, 0 failures, 1 errors

Now the problem we have is that we’ve instantiated an assessor, but we’ve also passed in an argument to it, and our class definition doesn’t accommodate this. Let’s fix that by adding an initialize method, which takes an argument. We’re not going to do anything with the argument yet—at this point, we’re concerned with getting to the point where we see the test fail, not return an error.

class HipsterAssessor

  def initialize(bike_gears)
  end

end

Run the test again:

$ ruby test_hipster.rb
Loaded suite test_hipster
Started
E
Finished in 0.000322 seconds.

  1) Error:
test_has_fixie?(HipsterTest):
NoMethodError: undefined method `has_fixie?' for nil:NilClass
    test_hipster.rb:11:in `test_has_fixie?'

1 tests, 0 assertions, 0 failures, 1 errors

Right now the error is that we haven’t written the has_fixie? method. Let’s go ahead and write that, but without an implementation. Your hipsterassessor.rb should look like this now:

class HipsterAssessor

  def initialize(bike_gears)
  end

  def has_fixie?
  end

end

And running the test should now result in a failure, not an error:

$ ruby test_hipster.rb
Loaded suite test_hipster
Started
F
Finished in 0.011201 seconds.

  1) Failure:
test_has_fixie?(HipsterTest) [test_hipster.rb:11]:
<true> expected but was <nil>.

1 tests, 1 assertions, 1 failures, 0 errors
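The difference between an error and a failure matters here, so it is worth making concrete. The following is only a rough sketch of how an xUnit-style runner might tell the two apart. The AssertionFailed exception, the run_test helper, and the MissingAssessor constant are hypothetical names invented for this illustration and are not part of Test::Unit.

```ruby
# Hypothetical stand-in for the exception an assertion raises.
class AssertionFailed < StandardError; end

# A minimal assertion: raise only when the expectation is not met.
def assert_equal(expected, actual)
  return if expected == actual
  raise AssertionFailed, "<#{expected.inspect}> expected but was <#{actual.inspect}>."
end

# Run a block and classify the outcome the way an xUnit runner would.
def run_test
  yield
  :pass
rescue AssertionFailed
  :failure # the code ran, but an expectation was not met
rescue StandardError
  :error   # the code blew up before the assertion could be checked
end

outcomes = {
  # MissingAssessor is deliberately undefined, so this raises NameError.
  'assessor class missing' => run_test { assert_equal true, MissingAssessor.new(1).has_fixie? },
  'method returns nil'     => run_test { assert_equal true, nil },
  'method returns true'    => run_test { assert_equal true, (1 == 1) }
}

outcomes.each { |label, outcome| puts "#{label}: #{outcome}" }
```

An error tells us the test could not even be carried out; a failure tells us it was carried out and the code gave the wrong answer. Only the second is the "red" we are aiming for before writing the implementation.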

Alright, we’re getting somewhere. The test expected the method to return true, but since we’ve not written the code yet, we got nil. Now let’s write the actual code:

class HipsterAssessor

  def initialize(bike_gears)
    @gears = bike_gears
  end

  def has_fixie?
    @gears == 1
  end

end

And finally, run the test and see it pass:

$ ruby test_hipster.rb
Loaded suite test_hipster
Started
.
Finished in 0.000259 seconds.

1 tests, 1 assertions, 0 failures, 0 errors

So, that’s an example of a simple unit test. Obviously these tests can get a lot more complicated, but other than adding methods that set up some state and ensure that state is no longer there at the end of the test, there’s not much more to Test::Unit. The end result is that while Test::Unit is by far the most widely used test tool in the wild, it’s almost never used by itself. Most Ruby projects will pull in a large number of additional Rubygems to provide more advanced testing capabilities: test randomization, more natural test descriptions, and the ability to set up ephemeral test fixtures that stand in for complex, time-consuming, or third-party libraries or processes.

In Ruby 1.9, Minitest replaced Test::Unit in the standard library. This clears out some of the old and rarely used cruft from Test::Unit, and brings powerful, modern testing functionality right into the standard library. An important thing to note about Minitest is that the version built into Ruby lags considerably behind the current latest version; indeed the version of Minitest built into my version of Ruby is 2.5.1. I would typically recommend you install the latest version from Rubygems, but at the time of writing, Minitest 5.0.0 has only just been released with a number of breaking changes. For this reason, in the present work, I recommend taking advantage of the so-called PessimisticVersionConstraint.

In the Gemfile, set your Minitest line to the following:

gem 'minitest', '~> 4.7'

This will ensure that you stay on a 4.x release, at version 4.7 or later, and never pick up the breaking 5.0 release.
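If you want to convince yourself what the constraint permits, RubyGems can evaluate it directly. This is a small sketch using Gem::Requirement and Gem::Version; the candidate version numbers are arbitrary values I picked to probe the boundaries, not a list of real Minitest releases.

```ruby
require 'rubygems' # loaded by default in modern Ruby; explicit here for clarity

# '~> 4.7' is shorthand for '>= 4.7' together with '< 5.0'.
constraint = Gem::Requirement.new('~> 4.7')

# Arbitrary version numbers, chosen only to probe the boundaries.
%w[4.6.2 4.7.0 4.7.5 4.9 5.0.0].each do |candidate|
  allowed = constraint.satisfied_by?(Gem::Version.new(candidate))
  puts "#{candidate}: #{allowed ? 'allowed' : 'rejected'}"
end
```

Only the middle three candidates are allowed: anything older than 4.7 is rejected, and so is the 5.0 major release.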

The newer Minitest syntax is backwards-compatible with Test::Unit, but the superclass has a different name. Let’s convert it:

require 'minitest/autorun'
require_relative 'hipsterassessor'

class HipsterTest < MiniTest::Unit::TestCase

  def setup
    @developer = HipsterAssessor.new(gears_on_bike=1)
  end

  def test_has_fixie?
    assert_equal true, @developer.has_fixie?
  end

end

$ ruby test_hipster.rb
Run options: --seed 37275

# Running tests:

.

Finished tests in 0.000873s, 1145.4754 tests/s, 1145.4754 assertions/s.

1 tests, 1 assertions, 0 failures, 0 errors, 0 skips

You’ll notice right away that the test is much faster. You might also notice the --seed run option. This is because Minitest runs your tests in a random order to prevent you getting into the situation where your tests pass because of so-called state leakage—i.e., some state remained from the previous test, which allowed the subsequent test to pass. Randomizing the order of the tests catches this. The seed is the number used to initialize a pseudorandom number generator, which provides the randomness upon which Minitest bases its decision about ordering. You can reproduce the same state by passing the same seed manually.
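To see why a seed makes a random order reproducible, here is a minimal sketch of the idea. It is not Minitest’s actual implementation; the order_for helper and the test names are hypothetical, and the only real API used is Ruby’s Array#shuffle with a seeded Random.

```ruby
# Hypothetical test names; only the ordering logic matters here.
tests = %w[test_has_fixie test_score test_points test_glasses]

# Shuffle deterministically from a seed, as a seeded test runner might.
def order_for(tests, seed)
  tests.shuffle(random: Random.new(seed))
end

first_run  = order_for(tests, 37275)
second_run = order_for(tests, 37275)

puts first_run.inspect
puts "same seed, same order: #{first_run == second_run}"
```

With Minitest itself, you get the same effect by passing the reported seed back on the command line, for example ruby test_hipster.rb --seed 37275, which is handy when a failure only shows up in one particular ordering.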

That concludes our whirlwind tour of traditional, backwards-compatible xUnit-style unit testing. Although you’re fairly unlikely to test your infrastructure code using this traditional approach, it’s valuable to have some familiarity with it, and the general approach of iterating on failing tests until they pass is the same regardless of the testing framework being used.

RSpec: The Transition to BDD

I mentioned earlier that while there is great value in traditional unit testing, it’s still possible to write code that passes unit tests but doesn’t deliver value to the customer. Unit tests assert that the code behaves as it should, but what asserts how the code should behave? In order to be sure that we’re building code that matters, we need some kind of specification that describes what the code should do. This is exactly the transition that is made when we start to think about behavior-driven development against test-driven development. We first specify what the behavior should be, in a written form. We then test that the code behaves as specified (which will, of course, fail). We then make the tests pass, and check against the specification. A core principle of BDD is that this specification be code itself—that the description of how our software behaves should itself be executable.

RSpec was developed around a recognition that looking at low-level code, with not entirely obvious assertion syntax and class and method definitions, was not really the ideal vehicle for expressing and communicating the intended behavior of code. It was inspired by an early Thoughtworks tool, Agiledox, which converted code that looked like this:

public class CustomerLookupTest extends TestCase {
    testFindsCustomerById() { ... }
    testFailsForDuplicateCustomers() { ... }
    ...
}

To a specification like this:

CustomerLookup
- finds customer by id
- fails for duplicate customers
- ...

The effect is remarkable. Immediately the intention is clear, and the brain takes it in. RSpec’s output looks similar, and its input is more palatable.
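The transformation itself is simple enough to sketch in a few lines of Ruby. This is my own illustration of the idea rather than Agiledox’s code; the sentence_from, subject_from, and specification helpers are hypothetical names.

```ruby
# Turn "testFindsCustomerById" into "finds customer by id".
def sentence_from(method_name)
  method_name
    .sub(/\Atest/, '')                     # drop the xUnit "test" prefix
    .gsub(/([A-Z])/) { " #{$1.downcase}" } # split camel case into words
    .strip
end

# Turn "CustomerLookupTest" into "CustomerLookup".
def subject_from(class_name)
  class_name.sub(/Test\z/, '')
end

# Build the readable specification from a class name and its test methods.
def specification(class_name, method_names)
  lines = [subject_from(class_name)]
  lines.concat(method_names.map { |name| "- #{sentence_from(name)}" })
  lines.join("\n")
end

puts specification('CustomerLookupTest',
                   %w[testFindsCustomerById testFailsForDuplicateCustomers])
```

The point is that well-chosen test names already contain a specification; a tool only has to present them in a form people want to read.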

Let’s write some specifications for the behavior of the HipsterAssessor.

First, let’s install the RSpec gem and create a directory to contain our specifications, called spec.

$ gem install rspec
$ mkdir spec

Inside the spec directory, create a file called hipster_assessor_spec.rb with the following contents:

require 'rspec'
require_relative '../hipsterassessor'

describe HipsterAssessor do
  context "assessing whether a developer is a hipster" do
    it "can establish if the developer has a fixed-wheel bicycle" do
      developer = HipsterAssessor.new(gears_on_bike=1)
      expect(developer.has_fixie?).to be_true
    end
  end
end

The first line simply makes the RSpec gem available, and the second line makes our HipsterAssessor class available. The describe block describes in a high level domain-specific language (DSL) what the class should do, and in what context it functions. If you read the code as English, it makes pretty easy reading:

You: Describe the HipsterAssessor!

Me: In the context of assessing whether a developer is a hipster, it can establish if the developer has a fixed-wheel bicycle.

Let’s run the test:

$ rspec -fd spec/

HipsterAssessor
  assessing whether a developer is a hipster
    can establish if the developer has a fixed-wheel bicycle

Finished in 0.00048 seconds
1 example, 0 failures

This is much closer to describing the behavior of the code than just testing a method.

Let’s add another feature. I think the HipsterAssessor should give the developer a hipster score. To this end, it would be good to see the score and set the score.

I’m going to add the following:

it "reports a hipster assessment score" do
  developer = HipsterAssessor.new(gears_on_bike=1)
  expect(developer.score).to be_kind_of(Numeric)
end

This gives us the following spec:

require 'rspec'
require_relative '../hipsterassessor'

describe HipsterAssessor do
  context "assessing whether a developer is a hipster" do
    it "can establish if the developer has a fixed-wheel bicycle" do
      developer = HipsterAssessor.new(gears_on_bike=1)
      expect(developer.has_fixie?).to be_true
    end

    it "reports a hipster assessment score" do
      developer = HipsterAssessor.new(gears_on_bike=1)
      expect(developer.score).to be_kind_of(Numeric)
    end
  end
end

Rather than running the whole spec each time, during development it’s useful to use the -e, --example argument, which will only run the examples that match a given string:

$ rspec -fd -e "score" spec/
Run options: include {:full_description=>/score/}

HipsterAssessor
  assessing whether a developer is a hipster
    reports a hipster assessment score (FAILED - 1)

Failures:

  1) HipsterAssessor assessing whether a developer is a hipster reports a hipster assessment score
     Failure/Error: expect(developer.score).to be_kind_of(Numeric)
     NoMethodError:
       undefined method `score' for #<HipsterAssessor:0x000000020bb128 @gears=1>
     # ./spec/hipster_assessor_spec.rb:13:in `block (3 levels) in <top (required)>'

Finished in 0.00052 seconds
1 example, 1 failure

Failed examples:

rspec ./spec/hipster_assessor_spec.rb:11 # HipsterAssessor assessing whether a developer is a hipster reports a hipster assessment score

The process should be familiar now. We need to write the code to make the test pass. We can make the test pass trivially by adding:

def score
  10
end

Our test now passes:

$ rspec -fd -e "score" spec/
Run options: include {:full_description=>/(?-mix:score)/}

HipsterAssessor
  assessing whether a developer is a hipster
    reports a hipster assessment score

Finished in 0.00147 seconds
1 example, 0 failures

This is fine and meets our specification. This might seem a bit silly—surely the developer won’t always get a score of 10? Well, this is the point of BDD. We iterate quickly, and drive out the requirements. What’s wrong with score 10? Maybe it’s that it’s meaningless? Maybe it’s that it never varies? In which case we need to specify what the code should do. An important concept here is to ask the question, “What’s the next most important thing that the system does not currently do?” In our case, let’s say we want the score to vary depending on criteria. For example, let’s say that having a fixie scores five points, and then add another thing to test for, let’s say, empty spectacle frames. So, we add:

it "awards five points for having a fixie" do
  developer = HipsterAssessor.new(gears_on_bike=1)
  expect(developer.score).to eq 5
end

And run the test (again, using the -e argument), which shows:

$ rspec -fd -e "points" spec/
Run options: include {:full_description=>/points/}

HipsterAssessor
  assessing whether a developer is a hipster
    awards five points for having a fixie (FAILED - 1)

Failures:

  1) HipsterAssessor assessing whether a developer is a hipster awards five points for having a fixie
     Failure/Error: expect(developer.score).to eq 5

       expected: 5
            got: 10

       (compared using ==)
     # ./spec/hipster_assessor_spec.rb:18:in `block (3 levels) in <top (required)>'

Finished in 0.00112 seconds
1 example, 1 failure

Failed examples:

rspec ./spec/hipster_assessor_spec.rb:16 # HipsterAssessor assessing whether a developer is a hipster awards five points for having a fixie

Let’s change the code to make it pass. This requires a few changes, so I’ll now show the whole class to date:

class HipsterAssessor

  def initialize(bike_gears)
    @gears = bike_gears
    @score = 0
  end

  def has_fixie?
    @gears == 1
  end

  def score
    if self.has_fixie?
      @score = @score + 5
    end
    @score
  end
end

Let’s run all the tests now. Our full spec looks like this:

require 'rspec'
require_relative '../hipsterassessor'

describe HipsterAssessor do
  context "assessing whether a developer is a hipster" do
    it "can establish if the developer has a fixed-wheel bicycle" do
      developer = HipsterAssessor.new(gears_on_bike=1)
      expect(developer.has_fixie?).to be_true
    end

    it "reports a hipster assessment score" do
      developer = HipsterAssessor.new(gears_on_bike=1)
      expect(developer.score).to be_kind_of(Numeric)
    end

    it "awards five points for having a fixie" do
      developer = HipsterAssessor.new(gears_on_bike=1)
      expect(developer.score).to eq 5
    end

  end

end

And if we run RSpec, we get:

$ rspec -fd spec/

HipsterAssessor
  assessing whether a developer is a hipster
    can establish if the developer has a fixed-wheel bicycle
    reports a hipster assessment score
    awards five points for having a fixie

Finished in 0.00161 seconds
3 examples, 0 failures

The eagle-eyed amongst you will probably have noticed that our score method adds another five points every time it is called, so asking for the score twice gives two different answers. Adding a test for this, and refactoring the code, is just the sort of thing that would happen in real life. Of course, additionally, specs can get much more complex than this, but you should now appreciate the difference between behavior-driven and test-driven development.
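One possible refactoring, offered as a sketch rather than the definitive answer, is to calculate the score from the assessor’s state each time instead of accumulating it in an instance variable. The FIXIE_POINTS constant is a name I have introduced for this illustration.

```ruby
class HipsterAssessor
  # Hypothetical constant naming the points awarded for a fixie.
  FIXIE_POINTS = 5

  def initialize(bike_gears)
    @gears = bike_gears
  end

  def has_fixie?
    @gears == 1
  end

  # Compute the score from the current state on every call, rather than
  # adding to a stored total, so repeated calls return the same answer.
  def score
    has_fixie? ? FIXIE_POINTS : 0
  end
end

developer = HipsterAssessor.new(1)
first_reading  = developer.score
second_reading = developer.score
puts "first: #{first_reading}, second: #{second_reading}"
```

In keeping with the test-first approach, the spec that asks for the score twice and expects five both times would be written, and seen to fail, before making this change.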

In the previous section, we looked at Minitest as a drop-in replacement for Test::Unit. In addition to the speed improvements and general leanness of the tool, another addition is the inclusion of a spec-like DSL, which brings BDD into the core library. Let’s look at how we’d express the preceding test using Minitest rather than RSpec.

Rewriting the tests also gives us a chance to refactor. I don’t like that we have repeated instantiating the HipsterAssessor three times.

Both RSpec and Minitest support hooks to set up state before tests. This allows us to simplify the test. Here’s the equivalent code for Minitest:

require 'minitest/autorun'
require_relative 'hipsterassessor'

class HipsterTest < MiniTest::Unit::TestCase

  describe HipsterAssessor do

    before do
      @developer = HipsterAssessor.new(gears_on_bike=1)
    end

    describe "when assessing whether a developer is a hipster" do

      it "can establish if the developer has a fixed-wheel bicycle" do
        @developer.has_fixie?.must_equal true
      end

      it "can report a hipster assessment score" do
        @developer.score.must_be_kind_of Numeric
      end

      it "can award five points for having a fixie" do
        @developer.score.must_equal 5
      end

    end

  end

end

I’ve added a before block, which sets up the state before each test. This is a pretty common pattern, and one you’ll see when we apply these principles to infrastructure code.

The syntax of Minitest is slightly different, and the matching and expectation grammar is not identical, but it’s clear that at this level of simplicity, Minitest can do exactly what RSpec does. Not having to include another gem makes this an attractive option. However, RSpec is very widely used, and my feeling is that the Chef community hasn’t yet settled on which it favors for testing infrastructure code. I’ve therefore given a brief introduction to both, and we’ll cover both when we look at writing tests for Chef recipes later in the book.

Let’s run the refactored test now, simply calling it with Ruby, now that we’re using Minitest:

$ ruby test_hipster.rb
Loaded suite test_hipster
Started
...
Finished in 0.000808 seconds.

3 tests, 3 assertions, 0 failures, 0 errors, 0 skips

Test run options: --seed 57543

Now that we’ve covered the basics of both RSpec and MiniTest::Spec, we’ll move on to examine Cucumber.

Cucumber: Acceptance Testing for the Masses

When Dan North first started thinking about BDD back in 2003, the context was not one of replacing TDD with a different set of practices or tools, but rather about how to go about explaining the reasons for, and the underpinning ideas behind TDD itself. As we’ve seen in this contrived example, it seems to make sense to start with tests right at the level of the application. However, as thought around BDD began to mature, and more people started to explore the perspective it offered, so the focus of the tests started to move towards the stakeholders—those for whom the software was being built. We can see this starting to happen in our RSpec example, but it hasn’t fully matured. The main thing missing is how to connect the stories—the self-contained units of work that developers commit to in an agile project—to work that represents real value to the stakeholder. Somehow we need to be able to demonstrate that the code we’re writing, and indeed testing, is applicable to the stories we’ve committed to delivering.

A useful template for capturing the story looks like this:

In order to achieve some specific, measurable, definable goal

As some kind of stakeholder

I want a feature

Moving to a BDD way of thinking brings many benefits. It helps to tease out how the software we write should behave, and it serves as an executable specification of what the software should do. This is undeniably a step in the right direction, but BDD-influenced thinkers wanted to take it a little further again. Using RSpec, or Minitest’s spec capabilities, might answer questions about how it should behave, but it doesn’t explain why. We never develop software in a vacuum. We rarely develop software just for fun. There’s always a reason behind it—some kind of driving force behind the project. In order best to understand that, and be sure we’re building the right features, for the right reasons, with the right priority, it’s necessary to engage the stakeholders—the people for whom we’re building the software.

An early tool for connecting these kinds of stories to RSpec was written by Dan North; Aslak Hellesoy greatly improved on it and released the result in 2008 as Cucumber. Cucumber takes the obvious benefits of test-first programming and adds a whole series of further benefits. In The Cucumber Book (Pragmatic Bookshelf), Matt Wynne and Aslak Hellesoy describe Cucumber as somewhat akin to a cheerful and friendly but rather nerdy team member, with a terrifyingly precise recollection of what it is the team is building and why, and who doesn’t mind the grunt work of repeatedly checking that what the team is working on is the right thing, running tests, and reporting back.

The key concept we’re exploring here is that software, and of course infrastructure, begins with an idea. Usually the idea is tied in some way to making something that can be sold, or used to reduce cost, improve efficiency, or add enjoyment. Whatever it is, there’s almost always a germ of an idea at the genesis of a software or infrastructure project. Unless the person who has the idea is incredibly gifted, it’s unlikely that they’ll be able to build it themselves, from scratch, without getting some help. As soon as you introduce help, especially technical help, you introduce the requirement to communicate. Even in an experienced agile team, with short iterations and a fast feedback cycle, it’s possible to spend two weeks working on the wrong thing, delivering something that the developers thought was right, but which was somehow confused, miscommunicated, or misunderstood. Cucumber offers a way to ease the communication and cooperation between people and teams.

At the heart of eXtreme Programming is the idea of automated acceptance tests. An acceptance test is simply some code that we can run, which captures some aspect of the functionality of the system. The idea is that the developer and a stakeholder collaborate on writing this test together to capture requirements in code, which, when it passes, forms some kind of seal of approval. These are distinct from the kind of unit tests we looked at previously. Unit tests are largely written by the developer and for the developer. They help a design to emerge, validate it, and protect against errors. Acceptance tests are written by the stakeholder and the developer, for the stakeholder and the developer. A commonly used expression is that the difference between unit tests and acceptance tests is that unit tests help you build the thing right, whereas acceptance tests help you build the right thing.

Despite the obvious benefits of automated acceptance tests, in practice even among experienced XP and TDD teams, it’s rarely done, or done well. One of the reasons is that finding a stakeholder with the technical ability, interest, and patience to sit at a computer writing pure Ruby code, even a DSL like RSpec, is incredibly hard. I remember working on an accounts package in PHP and pairing with a product manager, and actually writing SimpleTest acceptance tests. It worked really well, but I’ve never found a stakeholder since who is comfortable with that kind of technical involvement.

Cucumber helps to make automated acceptance testing a reality. If we think about what an acceptance test is, it’s really just an example. The stakeholder is saying: we need this feature for this purpose. Here are a few examples of how the system would behave if you’d implemented the feature I want. If you can prove to me that these examples do what I’ve asked, then I’ll be happy that the requirement is met. The challenge in making this happen is that in most cases, the areas of expertise of the stakeholder and the developer don’t coincide, often radically so. This is because each person is an expert in their own domain. I’m an expert at Chef, and a pretty competent Ruby and Python developer. I’m not an expert in social media advertising. The problem Cucumber sets out to solve is that of making it easy to find a shared language (a ubiquitous language) that everyone can use to describe what we’re trying to build and why we’re trying to build it. This language should be mired neither in the jargon of the developer nor in that of the person who had the idea in the first place.

Beyond making acceptance tests a reality, Cucumber also becomes documentation. Not documentation that slowly decays on a wiki—documentation that is an executable specification, that lives with and shapes the creation of the software. Documentation that can be shared, explored, grown, and that, ultimately, can be run from the command line, and should pass tests. This makes Cucumber potentially a very powerful source of truth and a barometer of health in a project. That’s a pretty awesome state of affairs.

Let’s look at how Cucumber works. At the highest level, it’s just another command-line tool. It reads in plain text files called features, which contain scenarios that describe examples of use cases for the feature. The features and scenarios are written in what is very close to natural language, but with a dozen or so grammar and syntax rules—a DSL called Gherkin. Each scenario is a sequence of steps that need to be carried out in order, setting up state, doing something, and then checking state again. These steps are then mapped onto Ruby code, which takes real action. These are called step definitions. Step definitions typically delegate to support code shipped with the test suite and call out to automation libraries for helper functions for doing things like driving a web browser or using a graphical interface. When Cucumber runs, it executes each step in turn. If all the steps complete successfully, the test is said to have passed, otherwise the user is informed that the test didn’t pass, and the exact state of the test is reported.
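The mapping from plain-text steps to Ruby blocks can be sketched in a few lines of plain Ruby. This is a toy illustration, not Cucumber's actual implementation: the class name TinyStepRunner and its step and run methods are invented for the example, and it covers only the passing case and the undefined-step case. Like Cucumber, it matches the step text without its leading keyword.

```ruby
# A toy step matcher (hypothetical, for illustration only). It shows the idea
# of mapping plain-text steps onto Ruby blocks with regular expressions.
class TinyStepRunner
  def initialize
    @definitions = [] # pairs of [pattern, block]
  end

  # Register a block to run when a step matches the pattern.
  def step(pattern, &block)
    @definitions << [pattern, block]
  end

  # Run each step in order. Stop and report :undefined as soon as a step
  # has no matching definition; otherwise report :passed.
  def run(steps)
    steps.each do |text|
      pattern, block = @definitions.find { |regexp, _| regexp.match(text) }
      return :undefined if pattern.nil?

      # Capture groups in the pattern become arguments to the block.
      block.call(*pattern.match(text).captures)
    end
    :passed
  end
end

runner = TinyStepRunner.new
log = []
runner.step(/^a developer with a fixed-wheel bike$/) { log << "set up developer" }
runner.step(/^the score should be (\d+)$/) { |score| log << "expect score #{score}" }

result = runner.run([
  "a developer with a fixed-wheel bike",
  "the score should be 5"
])
puts result
puts log
```

The real tool adds a great deal on top of this (pending and skipped steps, reporting, hooks), but the core idea of looking up each step by regular expression is the same.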

I mentioned before that software begins with an idea. Cucumber helps us to capture what the vision behind the idea is. We need to understand what the goal is. The vision might be massive, complex, and exciting. Our task is to work with the visionary to achieve something of value that moves them in the right direction. In recent years, this has started to be called a Minimum Marketable Feature or Minimum Viable Product. Whatever you call it, we’re looking for a description of something achievable that captures and advances the vision and purpose behind the software.

Let’s use Cucumber to write acceptance tests for the HipsterAssessor as a way to explore how the approach works.

Enter the HipsterAssessor project directory and run Cucumber:

$ gem install cucumber
$ cucumber
You don't have a 'features' directory. Please create one to get started.
See http://cukes.info/ for more information.

OK, let’s create a directory for the features, and try again:

$ mkdir features
$ cucumber
0 scenarios
0 steps
0m0.000s

This illustrates two important concepts: Cucumber is made from scenarios and steps. Each test that we write represents a scenario that we describe—it tests an aspect of the broader feature that we’re going to help implement. Each scenario contains steps that will tell Cucumber how to actually carry out the test and verify that the intended feature works as specified.

Features are written in a file with a .feature suffix, in the Gherkin language. Open your text editor and create a file called features/assess_hipster.feature. This is a plain text document, written with a few constraints. The constraints are minimal: the idea is that the feature we write should be in a natural language. In fact, one of the benefits of Gherkin is that it supports over 40 languages, so you can write your features in Russian or Welsh if you wish. Actually, this provides a good way to demonstrate how small the DSL is:

$ cucumber --i18n cy-GB
  | feature          | "Arwedd"              |
  | background       | "Cefndir"             |
  | scenario         | "Scenario"            |
  | scenario_outline | "Scenario Amlinellol" |
  | examples         | "Enghreifftiau"       |
  | given            | "* ", "Anrhegedig a " |
  | when             | "* ", "Pryd "         |
  | then             | "* ", "Yna "          |
  | and              | "* ", "A "            |
  | but              | "* ", "Ond "          |
  | given (code)     | "Anrhegediga"         |
  | when (code)      | "Pryd"                |
  | then (code)      | "Yna"                 |
  | and (code)       | "A"                   |
  | but (code)       | "Ond"                 |

That’s the extent of the DSL. Using these keywords, we express our feature. In terms of grammar, the rules are very simple. The file must begin with a feature, followed by a title. This may be followed by an arbitrary number of lines of freeform text to document the feature.

Feature: Assess hipster

  In order to make sure developers are comfortable in their workplace
  As a manager who has just hired a developer
  I want to be able to assess whether the new developer is a hipster

  Hipsters like to have Pabst Blue Ribbon in the fridge, listen to vinyl
  recordings of Lily Allen, and need a place to store their fixed-wheel
  bicycles. As a manager I need to be sure I can accommodate hipsters,
  so I want a simple web app that gives a questionnaire which will advise
  me if the developer is a hipster.

  Scenario: Fixed-wheel bicycle scores 5 points
    Given a developer with a fixed-wheel bike
    When I request a hipster assessment
    Then the score should be 5

  Scenario: Spectacles despite 20/20 vision scores 10 points
    Given a developer with a pair of empty frames
    When I request a hipster assessment
    Then the score should be 10

The preceding code block is a full feature, expressed in Gherkin. The idea behind Gherkin is to be able to provide concrete examples that illustrate the required feature. As a language, Gherkin has been optimized for readability and portability. As you can see, Gherkin is pretty much indistinguishable from natural language.

A scenario describes the behavior of the system. Each scenario shares a common pattern. First we set up some state: what is the prerequisite to test the functionality? In this case, since we’re assessing developers, we need a developer. Next we take an action that we anticipate will change some state. In this case, we’re going to ask for a score. Finally, we check the new state and compare it to what we expected. In this case, we expect that the HipsterAssessor will award our developer some points.

The steps, introduced by the keywords ‘Given’, ‘When’, and ‘Then’, map onto Ruby code called step definitions. If we go ahead and run Cucumber now, we’ll see some progress:

$ cucumber
Feature: Assess hipster

  In order to make sure developers are comfortable in their workplace
  As a manager who has just hired a developer
  I want to be able to assess whether the new developer is a hipster

  Hipsters like to have Pabst Blue Ribbon in the fridge, listen to vinyl
  recordings of Lily Allen, and need a place to store their fixed-wheel
  bicycles. As a manager I need to be sure I can accommodate hipsters,
  so I want a simple web app that gives a questionnaire which will advise
  me if the developer is a hipster.

  Scenario: Fixed-wheel bicycle scores 5 points # features/assess_hipster.feature:13
    Given a developer with a fixed-wheel bike   # features/assess_hipster.feature:14
    When I request a hipster assessment         # features/assess_hipster.feature:15
    Then the score should be 5                  # features/assess_hipster.feature:16

  Scenario: Spectacles despite 20/20 vision scores 10 points # features/assess_hipster.feature:18
    Given a developer with a pair of empty frames            # features/assess_hipster.feature:19
    When I request a hipster assessment                      # features/assess_hipster.feature:20
    Then the score should be 10                              # features/assess_hipster.feature:21

2 scenarios (2 undefined)
6 steps (6 undefined)
0m0.003s

You can implement step definitions for undefined steps with these snippets:

Given(/^a developer with a fixed\-wheel bike$/) do
  pending # express the regexp above with the code you wish you had
end

When(/^I request a hipster assessment$/) do
  pending # express the regexp above with the code you wish you had
end

Then(/^the score should be (\d+)$/) do |arg1|
  pending # express the regexp above with the code you wish you had
end

Given(/^a developer with a pair of empty frames$/) do
  pending # express the regexp above with the code you wish you had
end

If you want snippets in a different programming language,
just make sure a file with the appropriate file extension
exists where Cucumber looks for step definitions.

Cucumber has generated some code snippets to get us started. Step definitions by convention reside in a step_definitions directory, under the features directory. Let’s create that directory, and inside there paste the suggested snippets into a file called assess_hipster_steps.rb.

Now if we run Cucumber we get a bit further:

$ cucumber
Feature: Assess hipster

  In order to make sure developers are comfortable in their workplace
  As a manager who has just hired a developer
  I want to be able to assess whether the new developer is a hipster

  Hipsters like to have Pabst Blue Ribbon in the fridge, listen to vinyl
  recordings of Lily Allen, and need a place to store their fixed-wheel
  bicycles. As a manager I need to be sure I can accommodate hipsters,
  so I want a simple web app that gives a questionnaire which will advise
  me if the developer is a hipster.

  Scenario: Fixed-wheel bicycle scores 5 points # features/assess_hipster.feature:13
    Given a developer with a fixed-wheel bike   # features/step_definitions/assess_hipster_steps.rb:1
      TODO (Cucumber::Pending)
      ./features/step_definitions/assess_hipster_steps.rb:2:in `/^a developer with a fixed\-wheel bike$/'
      features/assess_hipster.feature:14:in `Given a developer with a fixed-wheel bike'
    When I request a hipster assessment         # features/step_definitions/assess_hipster_steps.rb:5
    Then the score should be 5                  # features/step_definitions/assess_hipster_steps.rb:9

  Scenario: Spectacles despite 20/20 vision scores 10 points # features/assess_hipster.feature:18
    Given a developer with a pair of empty frames            # features/step_definitions/assess_hipster_steps.rb:13
      TODO (Cucumber::Pending)
      ./features/step_definitions/assess_hipster_steps.rb:14:in `/^a developer with a pair of empty frames$/'
      features/assess_hipster.feature:19:in `Given a developer with a pair of empty frames'
    When I request a hipster assessment                      # features/step_definitions/assess_hipster_steps.rb:5
    Then the score should be 10                              # features/step_definitions/assess_hipster_steps.rb:9

2 scenarios (2 pending)
6 steps (4 skipped, 2 pending)
0m0.004s

Cucumber is now calling our step definitions. But our step definitions don’t contain any code that does anything noteworthy, and so Cucumber stops and tells us that the first step of each scenario is pending (i.e., unwritten), and that it therefore skipped the remaining steps.

Let’s look at the structure of a step within a step definition:

Given /^a developer with a fixed\-wheel bike$/ do
  pending # express the regexp above with the code you wish you had
end

We’re now in pure Ruby—well, we’re in a pure Ruby DSL. Given is a DSL method that takes a regular expression and a block. The regular expression matches the step in the Gherkin scenario, and the contents of the block specifies what to do when this step is matched. The fact that we’re using regular expressions to match the steps in the Gherkin scenario gives us two very powerful capabilities—we can use capture groups and wildcards. This is just the same as capture groups in sed—you can put parentheses around some text and store what they match in a variable for later use. Wildcards are like a more powerful and flexible form of shell globbing—we can match non-whitespace characters, digits, lowercase letters, or combinations thereof. We’ll see this in action in the score step in a moment. Let’s write the step definitions for real now.

The first is pretty straightforward. We’re going to do the same as we did in the previous steps and instantiate a developer. Take a look at the comment that the automatically generated snippet contains. The key idea here is that we should write the code we wish we had. We don’t have any code at all. We’re simply expressing the interface we’d like to see. Interestingly, when we do it this way, we tend to think with a more design-oriented head. The code we have in the RSpec test is actually a bit ugly:

developer = HipsterAssessor.new(gears_on_bike=1)

Wouldn’t it be nicer to have a method on the assessor that sets the number of gears to a certain value? This would certainly be nicer if we were to think of an interface that we could use with a webform, or some other way to populate the object. Simply calling the constructor with an argument is rather clumsy. Let’s write the code we wish we had:

Given(/^a developer with a fixed\-wheel bike$/) do
  @developer = HipsterAssessor.new
  @developer.set(:gears_on_bike, 1)
end

Now, let’s fulfill the when step. This is just calling a method:

When(/^I request a hipster assessment$/) do
  @result = @developer.score.to_s
end

We need to convert the score to a string because the value captured from the feature text is passed to the step definition as a string, not an integer.

Now we come to the then step. Here we can see the power of the regular expression. Cucumber has already suggested we might be interested in the score and has suggested a capture group and wildcard. The value of this will be passed into the block as arg1. We should change that to something more readable.

Then /^the score should be (\d+)$/ do |score|
  expect(@result).to eq score
end
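To see why the comparison is made between strings, here is a small plain-Ruby sketch of what a capture group hands back. It uses only Ruby's built-in Regexp matching and runs without Cucumber; the variable names are chosen for the example.

```ruby
# What the (\d+) capture group yields when matched against the step text.
step_text = "the score should be 5"
match = /^the score should be (\d+)$/.match(step_text)

captured = match[1]

puts captured.class      # the capture is a String
puts captured == 5       # false: a String never equals an Integer
puts captured == 5.to_s  # true: compare string with string
puts captured.to_i == 5  # true: or convert the capture instead
```

Either conversion works. Converting the capture with to_i inside the Then step would be a reasonable alternative to calling to_s on the score in the When step.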

While we’re at it, let’s add the developer with empty frames given, and then we can run the whole feature.

Given(/^a developer with a pair of empty frames$/) do
  @developer = HipsterAssessor.new
  @developer.set(:glasses_prescription, nil)
end

We should probably move this up, so it reads nicely, too. At this stage our steps look like this:

Given(/^a developer with a fixed\-wheel bike$/) do
  @developer = HipsterAssessor.new
  @developer.set(:gears_on_bike, 1)
end

Given(/^a developer with a pair of empty frames$/) do
  @developer = HipsterAssessor.new
  @developer.set(:glasses_prescription, nil)
end

When(/^I request a hipster assessment$/) do
  @result = @developer.score.to_s
end

Then /^the score should be (\d+)$/ do |score|
  expect(@result).to eq score
end

OK…what happens when we run Cucumber?

$ cucumber
Feature: Assess hipster

  In order to make sure developers are comfortable in their workplace
  As a manager who has just hired a developer
  I want to be able to assess whether the new developer is a hipster

  Hipsters like to have Pabst Blue Ribbon in the fridge, listen to vinyl
  recordings of Lily Allen, and need a place to store their fixed-wheel
  bicycles. As a manager I need to be sure I can accommodate hipsters,
  so I want a simple web app that gives a questionnaire which will advise
  me if the developer is a hipster.

  Scenario: Fixed-wheel bicycle scores 5 points # features/assess_hipster.feature:13
    Given a developer with a fixed-wheel bike   # features/step_definitions/assess_hipster_steps.rb:1
      uninitialized constant HipsterAssessor (NameError)
      ./features/step_definitions/assess_hipster_steps.rb:2:in `/^a developer with a fixed\-wheel bike$/'
      features/assess_hipster.feature:14:in `Given a developer with a fixed-wheel bike'
    When I request a hipster assessment         # features/step_definitions/assess_hipster_steps.rb:6
    Then the score should be 5                  # features/step_definitions/assess_hipster_steps.rb:10

  Scenario: Spectacles despite 20/20 vision scores 10 points # features/assess_hipster.feature:18
    Given a developer with a pair of empty frames            # features/step_definitions/assess_hipster_steps.rb:14
      undefined method `set' for nil:NilClass (NoMethodError)
      ./features/step_definitions/assess_hipster_steps.rb:15:in `/^a developer with a pair of empty frames$/'
      features/assess_hipster.feature:19:in `Given a developer with a pair of empty frames'
    When I request a hipster assessment                      # features/step_definitions/assess_hipster_steps.rb:6
    Then the score should be 10                              # features/step_definitions/assess_hipster_steps.rb:10

Failing Scenarios:
cucumber features/assess_hipster.feature:13 # Scenario: Fixed-wheel bicycle scores 5 points
cucumber features/assess_hipster.feature:18 # Scenario: Spectacles despite 20/20 vision scores 10 points

2 scenarios (2 failed)
6 steps (2 failed, 4 skipped)
0m0.004s

OK, this is familiar—we have a failing test! We haven’t connected the test to our code. Let’s do that by adding the require_relative to the top of the steps:

require_relative '../../hipsterassessor'

Now the test runs, and the relevant part of the output is:

  Scenario: Fixed-wheel bicycle scores 5 points # features/assess_hipster.feature:13
    Given a developer with a fixed-wheel bike   # features/step_definitions/assess_hipster_steps.rb:3
      wrong number of arguments (0 for 1) (ArgumentError)
      ./hipsterassessor.rb:3:in `initialize'
      ./features/step_definitions/assess_hipster_steps.rb:4:in `new'
      ./features/step_definitions/assess_hipster_steps.rb:4:in `/^a developer with a fixed\-wheel bike$/'
      features/assess_hipster.feature:14:in `Given a developer with a fixed-wheel bike'
    When I request a hipster assessment         # features/step_definitions/assess_hipster_steps.rb:12
    Then the score should be 5                  # features/step_definitions/assess_hipster_steps.rb:16

Now, here we’re working slightly outside the standard pattern I would recommend because I chose to introduce testing from the unit tests out. We’ve actually already got code, which we’re calling, that we need to change. Let’s follow through and see what happens. So at the moment, our test code is calling the constructor with no arguments:

Given(/^a developer with a fixed\-wheel bike$/) do
  @developer = HipsterAssessor.new
  @developer.set(:gears_on_bike, 1)
end

But in the actual class, we specified that the constructor takes an argument. Let’s remove that:

def initialize
  @gears = bike_gears
  @score = 0
end

While we’re there, our constructor shouldn’t try to set gears any more either, so let’s remove that line:

def initialize
  @score = 0
end

Running Cucumber now yields the following:

$ cucumber features/assess_hipster.feature:13
Feature: Assess hipster

  In order to make sure developers are comfortable in their workplace
  As a manager who has just hired a developer
  I want to be able to assess whether the new developer is a hipster

  Hipsters like to have Pabst Blue Ribbon in the fridge, listen to vinyl
  recordings of Lily Allen, and need a place to store their fixed-wheel
  bicycles. As a manager I need to be sure I can accommodate hipsters,
  so I want a simple web app that gives a questionnaire which will advise
  me if the developer is a hipster.

  Scenario: Fixed-wheel bicycle scores 5 points # features/assess_hipster.feature:13
    Given a developer with a fixed-wheel bike   # features/step_definitions/assess_hipster_steps.rb:3
      undefined method `set' for #<HipsterAssessor:0x00000002aaf648 @score=0> (NoMethodError)
      ./features/step_definitions/assess_hipster_steps.rb:5:in `/^a developer with a fixed\-wheel bike$/'
      features/assess_hipster.feature:14:in `Given a developer with a fixed-wheel bike'
    When I request a hipster assessment         # features/step_definitions/assess_hipster_steps.rb:12
    Then the score should be 5                  # features/step_definitions/assess_hipster_steps.rb:16

Failing Scenarios:
cucumber features/assess_hipster.feature:13 # Scenario: Fixed-wheel bicycle scores 5 points

1 scenario (1 failed)
3 steps (1 failed, 2 skipped)
0m0.002s

We need a set method. At this point, we should drop down a level to RSpec or Minitest and write a test for the set method.

First, let’s run the test and see what breaks:

$ ruby test_hipster.rb
Run options: --seed 59310

# Running tests:

EEE

Finished tests in 0.000963s, 3113.7225 tests/s, 0.0000 assertions/s.

  1) Error:
HipsterAssessor::when assessing whether a developer is a hipster#test_0001_can establish if the developer has a fixed-wheel bicycle:
ArgumentError: wrong number of arguments (1 for 0)
    /home/tdi/tdd-principles/hipsterassessor.rb:3:in `initialize'
    test_hipster.rb:8:in `new'
    test_hipster.rb:8:in `block (2 levels) in <main>'

  2) Error:
HipsterAssessor::when assessing whether a developer is a hipster#test_0002_can report a hipster assessment score:
ArgumentError: wrong number of arguments (1 for 0)
    /home/tdi/tdd-principles/hipsterassessor.rb:3:in `initialize'
    test_hipster.rb:8:in `new'
    test_hipster.rb:8:in `block (2 levels) in <main>'

  3) Error:
HipsterAssessor::when assessing whether a developer is a hipster#test_0003_can award five points for having a fixie:
ArgumentError: wrong number of arguments (1 for 0)
    /home/tdi/tdd-principles/hipsterassessor.rb:3:in `initialize'
    test_hipster.rb:8:in `new'
    test_hipster.rb:8:in `block (2 levels) in <main>'

3 tests, 0 assertions, 0 failures, 3 errors, 0 skips

Unsurprisingly, everything breaks because we’re calling the constructor differently. Thankfully that’s trivial to fix in our Minitest test; we only instantiate the assessor in one place. Change that to:

before do
  @developer = HipsterAssessor.new
end

Now running the test returns failures not errors:

$ ruby test_hipster.rb
Run options: --seed 21627

# Running tests:

FF.

Finished tests in 0.040021s, 74.9611 tests/s, 74.9611 assertions/s.

  1) Failure:
HipsterAssessor::when assessing whether a developer is a hipster#test_0001_can establish if the developer has a fixed-wheel bicycle [test_hipster.rb:14]:
Expected: true
  Actual: false

  2) Failure:
HipsterAssessor::when assessing whether a developer is a hipster#test_0003_can award five points for having a fixie [test_hipster.rb:22]:
Expected: 5
  Actual: 0

3 tests, 3 assertions, 2 failures, 0 errors, 0 skips

We also need to write a test for the set method, and then turn to fixing the remaining tests. As we think about it, we realize we need a get method, too, since the test relies on it:

it "can set a hipster credential to a given value" do
  @developer.set(:favorite_beer, "PBR")
  @developer.get(:favorite_beer).must_equal "PBR"
end

Let’s run the test:

$ ruby test_hipster.rb
Run options: --seed 44375

# Running tests:

E.FF

Finished tests in 0.018478s, 216.4764 tests/s, 162.3573 assertions/s.

  1) Error:
HipsterAssessor::when assessing whether a developer is a hipster#test_0004_can set a hipster credential to a given value:
NoMethodError: undefined method `set' for #<HipsterAssessor:0x00000000efe890 @score=0>
    test_hipster.rb:26:in `block (3 levels) in <main>'

  2) Failure:
HipsterAssessor::when assessing whether a developer is a hipster#test_0003_can award five points for having a fixie [test_hipster.rb:22]:
Expected: 5
  Actual: 0

  3) Failure:
HipsterAssessor::when assessing whether a developer is a hipster#test_0001_can establish if the developer has a fixed-wheel bicycle [test_hipster.rb:14]:
Expected: true
  Actual: false

4 tests, 3 assertions, 2 failures, 1 errors, 0 skips

Right, let’s implement the set method. Add a hipster_credentials hash to the constructor, and then the set method:

def initialize
  @score = 0
  @hipster_credentials = {}
end

def set(key, value)
  @hipster_credentials[key] = value
end

Running the tests now reveals that we need a get method. We already exercise this in the test, so let’s write the method for that:

def get(key)
  @hipster_credentials[key]
end

Now run the tests:

$ ruby test_hipster.rb
Run options: --seed 32953

# Running tests:

F..F

Finished tests in 0.018647s, 214.5123 tests/s, 214.5123 assertions/s.

  1) Failure:
HipsterAssessor::when assessing whether a developer is a hipster#test_0003_can award five points for having a fixie [test_hipster.rb:22]:
Expected: 5
  Actual: 0

  2) Failure:
HipsterAssessor::when assessing whether a developer is a hipster#test_0001_can establish if the developer has a fixed-wheel bicycle [test_hipster.rb:14]:
Expected: true
  Actual: false

4 tests, 4 assertions, 2 failures, 0 errors, 0 skips

OK, now our get and set methods work. We still have other failing tests though, which we need to make pass. When we set up state in the test we need to use HipsterAssessor#get and HipsterAssessor#set, for those cases where a fixed-wheel bicycle is mentioned. We also need to make the has_fixie? method use the hipster_credentials hash.

In our test, we make the updates:

it "can establish if the developer has a fixed-wheel bicycle" do
  @developer.set(:gears_on_bike, 1)
  @developer.has_fixie?.must_equal true
end

it "can award five points for having a fixie" do
  @developer.set(:gears_on_bike, 1)
  @developer.score.must_equal 5
end

And in the class:

def has_fixie?
  @hipster_credentials[:gears_on_bike] == 1
end

Now all the tests pass!

$ ruby test_hipster.rb
Run options: --seed 44033

# Running tests:

....

Finished tests in 0.000843s, 4744.1706 tests/s, 4744.1706 assertions/s.

4 tests, 4 assertions, 0 failures, 0 errors, 0 skips

Let’s quickly summarize the state of the test and the class. Here’s the test:

require 'minitest/autorun'
require_relative 'hipsterassessor'

describe HipsterAssessor do

  before do
    @developer = HipsterAssessor.new
  end

  describe "when assessing whether a developer is a hipster" do

    it "can establish if the developer has a fixed-wheel bicycle" do
      @developer.set(:gears_on_bike, 1)
      @developer.has_fixie?.must_equal true
    end

    it "can report a hipster assessment score" do
      @developer.score.must_be_instance_of Fixnum
    end

    it "can award five points for having a fixie" do
      @developer.set(:gears_on_bike, 1)
      @developer.score.must_equal 5
    end

    it "can set a hipster credential to a given value" do
      @developer.set(:favorite_beer, "PBR")
      @developer.get(:favorite_beer).must_equal "PBR"
    end

  end

end

And here’s the class:

class HipsterAssessor

  def initialize
    @score = 0
    @hipster_credentials = {}
  end

  def set(key, value)
    @hipster_credentials[key] = value
  end

  def get(key)
    @hipster_credentials[key]
  end

  def has_fixie?
    @hipster_credentials[:gears_on_bike] == 1
  end

  def score
    if self.has_fixie?
      @score = @score + 5
    end
    @score
  end
end
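One thing worth noticing about the class as it stands: because score adds to @score every time it is called, asking the same developer for a score twice gives two different answers. The tests above call score only once per fresh object, so they don't expose this. A minimal sketch of one alternative, assuming we would rather compute the total afresh on each call, follows. StatelessAssessor is a hypothetical name used only to keep the example separate from the chapter's class.

```ruby
# Hypothetical variant of the assessor: the score is derived from the
# credentials on every call instead of being accumulated in @score.
class StatelessAssessor
  def initialize
    @hipster_credentials = {}
  end

  def set(key, value)
    @hipster_credentials[key] = value
  end

  def has_fixie?
    @hipster_credentials[:gears_on_bike] == 1
  end

  # Start from zero each time, so repeated calls agree with each other.
  def score
    total = 0
    total += 5 if has_fixie?
    total
  end
end

developer = StatelessAssessor.new
developer.set(:gears_on_bike, 1)

first = developer.score
second = developer.score
puts first
puts second
```

Whether this matters depends on how the assessor ends up being used; for the scenarios in this chapter, either form satisfies the tests.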

Now that the tests pass, we can go back out to Cucumber.

Running Cucumber now shows the first scenario passing! The second doesn’t pass:

$ cucumber
Feature: Assess hipster

  In order to make sure developers are comfortable in their workplace
  As a manager who has just hired a developer
  I want to be able to assess whether the new developer is a hipster

  Hipsters like to have Pabst Blue Ribbon in the fridge, listen to vinyl
  recordings of Lily Allen, and need a place to store their fixed-wheel
  bicycles. As a manager I need to be sure I can accommodate hipsters,
  so I want a simple web app that gives a questionnaire which will advise
  me if the developer is a hipster.

  Scenario: Fixed-wheel bicycle scores 5 points # features/assess_hipster.feature:13
    Given a developer with a fixed-wheel bike   # features/step_definitions/assess_hipster_steps.rb:3
    When I request a hipster assessment         # features/step_definitions/assess_hipster_steps.rb:13
    Then the score should be 5                  # features/step_definitions/assess_hipster_steps.rb:17

  Scenario: Spectacles despite 20/20 vision scores 10 points # features/assess_hipster.feature:18
    Given a developer with a pair of empty frames            # features/step_definitions/assess_hipster_steps.rb:8
    When I request a hipster assessment                      # features/step_definitions/assess_hipster_steps.rb:13
    Then the score should be 10                              # features/step_definitions/assess_hipster_steps.rb:17

      expected: "10"
           got: "0"

      (compared using ==)
       (RSpec::Expectations::ExpectationNotMetError)
      ./features/step_definitions/assess_hipster_steps.rb:18:in `/^the score should be (\d+)$/'
      features/assess_hipster.feature:21:in `Then the score should be 10'

Failing Scenarios:
cucumber features/assess_hipster.feature:18 # Scenario: Spectacles despite 20/20 vision scores 10 points

2 scenarios (1 failed, 1 passed)
6 steps (1 failed, 5 passed)
0m0.004s

This requires us to go back to the lower level, and write a test for awarding points on the basis of phoney spectacles. Let’s add that test:

it "can award ten points for phoney spectacles" do
  @developer.set(:glasses_prescription, nil)
  @developer.score.must_equal 10
end

Watch it fail:

$ ruby test_hipster.rb
Run options: --seed 36835

# Running tests:

..F..

Finished tests in 0.018249s, 273.9899 tests/s, 273.9899 assertions/s.

  1) Failure:
HipsterAssessor::when assessing whether a developer is a hipster#test_0004_can award ten points for phoney spectacles [test_hipster.rb:29]:
Expected: 10
  Actual: 0

5 tests, 5 assertions, 1 failures, 0 errors, 0 skips

Now we realize that we should also test that the assessor can establish whether the developer has phoney specs. Let’s add that, too:

it "can establish if the developer has phoney spectacles" do
  @developer.set(:glasses_prescription, nil)
  @developer.has_phoney_specs?.must_equal true
end

Run the tests and watch them fail:

$ ruby test_hipster.rb
Run options: --seed 7359

# Running tests:

.F..E.

Finished tests in 0.018551s, 323.4269 tests/s, 269.5224 assertions/s.

  1) Failure:
HipsterAssessor::when assessing whether a developer is a hipster#test_0005_can award ten points for phoney spectacles [test_hipster.rb:34]:
Expected: 10
  Actual: 0

  2) Error:
HipsterAssessor::when assessing whether a developer is a hipster#test_0002_can establish if the developer has phoney spectacles:
NoMethodError: undefined method `has_phoney_specs?' for #<HipsterAssessor:0x0000000136db38>
    test_hipster.rb:20:in `block (3 levels) in <main>'

6 tests, 5 assertions, 1 failures, 1 errors, 0 skips

Now add the method which checks for the specs, and then update the score method to return 10 in the case of phoney specs:

def has_phoney_specs?
  @hipster_credentials[:glasses_prescription] == nil
end

def score
  if self.has_fixie?
    @score = @score + 5
  elsif self.has_phoney_specs?
    @score = @score + 10
  end
  @score
end
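To see how these two methods sit alongside the rest of the class, here is a minimal, self-contained sketch. Only has_phoney_specs? and score come from this section; the constructor, the set method, and the body of has_fixie? are assumptions made so that the example runs on its own, and the real test_hipster.rb may well differ. The sketch checks the :glasses_prescription key, which is the one the tests set.

```ruby
# Sketch only: initialize, set, and has_fixie? are assumed for illustration;
# this section shows just has_phoney_specs? and score.
class HipsterAssessor
  def initialize
    @hipster_credentials = {} # attribute name => reported value
    @score = 0
  end

  # Record one questionnaire answer (assumed signature, matching the specs).
  def set(attribute, value)
    @hipster_credentials[attribute] = value
  end

  # Assumption: a fixed-wheel bike is one with exactly one gear.
  def has_fixie?
    @hipster_credentials[:gears_on_bike] == 1
  end

  # Glasses without a prescription count as phoney. An answer that was
  # never set also reads as nil here, just like an explicit nil.
  def has_phoney_specs?
    @hipster_credentials[:glasses_prescription].nil?
  end

  # Because of the if/elsif, at most one rule adds points per call.
  def score
    if has_fixie?
      @score += 5
    elsif has_phoney_specs?
      @score += 10
    end
    @score
  end
end

developer = HipsterAssessor.new
developer.set(:glasses_prescription, nil)
puts developer.has_phoney_specs? # prints true
puts developer.score             # prints 10
```

In this sketch a developer with both a fixie and phoney specs would score 5 rather than 15, because the elsif branch is never reached. Whether that is the desired behavior is exactly the kind of question a further scenario would pin down.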

Once more with feeling!

$ ruby test_hipster.rb
Run options: --seed 15341

# Running tests:

......

Finished tests in 0.000912s, 6581.1700 tests/s, 6581.1700 assertions/s.

6 tests, 6 assertions, 0 failures, 0 errors, 0 skips

And with Cucumber?

$ cucumber
Feature: Assess hipster

In order to make sure developers are comfortable in their workplace
As a manager who has just hired a developer
I want to be able to assess whether the new developer is a hipster

Hipsters like to have Pabst Blue Ribbon in the fridge, listen to vinyl recordings of Lily Allen, and need a place to store their fixed-wheel bicycles. As a manager I need to be sure I can accommodate hipsters, so I want a simple web app that gives a questionnaire which will advise me if the developer is a hipster.

Scenario: Fixed-wheel bicycle scores 5 points # features/assess_hipster.feature:13
  Given a developer with a fixed-wheel bike # features/step_definitions/assess_hipster_steps.rb:3
  When I request a hipster assessment # features/step_definitions/assess_hipster_steps.rb:13
  Then the score should be 5 # features/step_definitions/assess_hipster_steps.rb:17

Scenario: Spectacles despite 20/20 vision scores 10 points # features/assess_hipster.feature:18
  Given a developer with a pair of empty frames # features/step_definitions/assess_hipster_steps.rb:8
  When I request a hipster assessment # features/step_definitions/assess_hipster_steps.rb:13
  Then the score should be 10 # features/step_definitions/assess_hipster_steps.rb:17

2 scenarios (2 passed)
6 steps (6 passed)
0m0.004s

So, at the end of that whistle-stop tour of testing in Ruby, you should feel confident that you understand the rationale, toolchain, and workflow of test- and behavior-driven development. Let’s now move on to discuss how to implement some of these ideas in infrastructure code.