BDD in Action: Behavior-Driven Development for the whole software lifecycle (2015)

Part 3. How do I build it? Coding the BDD way

Chapter 9. Automating acceptance criteria for non-UI requirements

This chapter covers

· Balancing UI and non-UI acceptance criteria

· Where to use non-UI acceptance tests

· Automating acceptance tests for the controller layer of a web application

· Automating acceptance tests that test application code directly

· Automating acceptance tests for remote services

· Automating acceptance tests for nonfunctional requirements

· Discovering application design using non-UI acceptance tests

Although they have their uses, web tests shouldn’t be the only tool in your automated acceptance testing toolbox. It’s important to know when to use them and when to look for alternative strategies. In this chapter, you’ll learn about other ways to automate your acceptance tests that don’t involve exercising the user interface (see figure 9.1).

Figure 9.1. In this chapter we’ll focus on automating acceptance tests that exercise the non-UI components of your application.

In chapter 8, you learned how to automate acceptance tests using automated web testing tools such as Selenium 2/WebDriver. Web tests are a great way to simulate the user’s journey through the system, to illustrate interactions with an application, and to document the key features of your application. Stakeholders can easily relate to web tests, as they’re highly visual, intuitive, and map closely to the user experience. Automated web tests can also be used to demonstrate new features, which is a great way to increase confidence in the tests. Automated web tests are also the only way to effectively test business logic that’s implemented directly within the user interface (UI).

But you also saw that end-to-end web tests tend to execute significantly more slowly than tests that don’t involve the UI. Interacting with a browser adds significant overhead in terms of execution time and system resources. Web tests that rely on a real browser are also more subject to technical or environment-related issues that are hard to control. For example, web tests can fail because the wrong version of a browser is installed on a test machine, or because the browser crashes. Tests that fail for reasons unrelated to the application logic waste development time and resources, and can reduce the team’s confidence in the test suite.

Although they can have great value, web tests shouldn’t be your only option when it comes to automating your BDD scenarios:

· Most applications need a judicious mix of both automated UI tests and automated tests for non-UI components.

· Non-UI tests can work at different levels of the application, including tests for the controller layer of an MVC application, tests that exercise the business rules implemented in the application code directly, and tests that work with remote services.

· Non-UI tests can also be used to verify nonfunctional requirements, such as performance.

· Implementing non-UI acceptance tests, and the corresponding application code, is also a great way to discover what components your application needs and to design clean, effective APIs within your application.

Let’s start with how you should find the correct balance between UI and non-UI acceptance tests.

9.1. Balancing UI and non-UI acceptance tests

The ideal balance between UI and non-UI tests will naturally vary from project to project, depending on the nature of the application being built, the features being developed, and the technologies used. For some requirements, web or UI tests will be a natural fit, but for others, non-UI-based testing is more appropriate. Still other tests may need a mixture of both approaches.

Some web applications choose to keep business logic on the client side to a minimum, with the bulk of the business logic being performed on the server side in a well-defined service layer. This can be a deliberate design decision or a consequence of the chosen technology stack. This approach is illustrated in figure 9.2.

Figure 9.2. When an application has little business logic in the UI, much of the business logic can be tested directly against the service layer.

In this sort of application, automated web tests will typically be used to illustrate and verify the user’s journey through the application, to ensure that data is submitted to the server correctly, to check that form validation messages are displayed correctly, to see that results from the server are rendered accurately, and so on. The web tests also act as end-to-end tests, verifying the flow of information through the whole system. If the application is designed well, the business logic will be localized in the form of a well-defined service API that can be tested effectively using non-web tests.

If, on the other hand, the application wasn’t designed with a clean service layer, business logic may be scattered through several layers of the system, and there may be no easy way to isolate particular operations. This is often the case for legacy applications, or it may be the result of poor architectural choices; in either case, pragmatic teams may use web testing to verify both the UI behavior and the business logic.

It’s increasingly rare these days to see an application without any client-side behavior. Modern applications need to provide rich, interactive experiences that can’t be achieved with older approaches. This can only be done by including at least some business logic and behavior within the UI layer. Modern JavaScript-based frameworks such as AngularJS and Backbone take this to the point where the client layer can be considered an application in its own right, calling remote web services for tasks that it can’t perform locally or to retrieve the data it needs to function (see figure 9.3). Mobile apps often use a similar architecture, with a significant amount of business logic residing within the client application.

Figure 9.3. Modern single-page apps need more comprehensive tests of the business logic in both the UI and web service layer.

In both these cases, the web services used by the rich client application form a clean API that can be effectively tested using non-UI testing, whereas the screen flow and behavioral logic of the UI can be effectively tested using web tests.

In the rest of the chapter, we’ll look at a few strategies that can be applied in these different scenarios.

9.2. When to use non-UI acceptance tests

The business logic or business rules of an application describe how the application is expected to deliver business value, as well as the constraints under which it will operate. Business logic looks beyond how a user interacts with the application and focuses on what the user expects to achieve. The UI and user experience play an essential role in helping users get the most out of an application, but if you aren’t clear on the underlying business rules and constraints, even the best-designed UIs will be wasted on features that are of limited practical value to your end users.

Business logic includes things like these:

· What are the expected outcomes, in business terms, of a particular user action?

· How does your application or business differentiate itself from its competitors? What makes it better than the legacy application you’re replacing?

· What business rules need to be applied?

· What business-related constraints apply to user actions?

For example, suppose you’re promoting a special Flying High–endorsed credit card that members can use to earn Frequent Flyer points when they make purchases. Because this is a web-based application with a rich user experience, web tests will play a critical role in your testing strategy. You’ll need to verify the user’s journey through the screens to apply for a new credit card, check on how successful and failing applications are displayed, ensure appropriate legal texts are agreed to, and so forth.

An automated web test might be a good way to verify how the Frequent Flyer home page encourages members to apply for the new credit card:

Scenario: The Frequent Flyers site encourages members to apply for the new credit card

Given Joe is a Flying High Frequent Flyer

When Joe views his account home page

Then he should be able to apply for a Flying High Credit Card

Cucumber Examples

The approaches discussed here are relatively implementation-neutral and tool-agnostic, but I did need to choose a format for the sample code. In the previous chapter we discussed examples built using JBehave and Thucydides, so in this chapter you’ll use Cucumber.

Other examples of useful web-based scenarios might describe the user’s journey through the application process:

Scenario: Joe is eligible for a Flying High Credit Card

Given Joe is a Flying High Frequent Flyer eligible for automatic Credit Card approval


When Joe applies for a Flying High Credit Card

Then Joe should be informed that his application was successful

And Joe should receive a confirmation email

And Joe's application should be queued for approval

Scenario: Joe is not eligible for a Flying High Credit Card

Given Joe is a Flying High Frequent Flyer who is not eligible for automatic Credit Card approval

When Joe applies for a Flying High Credit Card

Then Joe should be informed that Flying High will be in touch

And Joe's application should be queued for manual processing

Both of these scenarios illustrate key paths through the application and focus on how the user interacts with the website. You could implement them using the approaches we discussed in chapter 8.

However, because these acceptance criteria focus on the user experience and high-level outcomes, they deliberately gloss over some important underlying business logic. In particular, how do you determine if Joe is eligible or not? And depending on the scope of the feature, you may also need to explore other questions. How will the feature attract new Frequent Flyer members, and how will it earn revenue for the organization? This may involve some complex business rules. If you were to write automated web tests for each of these rules, it would slow down and add complexity to the test suite without adding a great deal of reporting value.

For example, income is an important consideration. The Flying High sales manager explains that regular Frequent Flyers with an annual income of $120,000 or more will be approved automatically, whereas those with an income of less than $50,000 will be declined. Those in between will need to be processed manually:

Scenario Outline: Credit card eligibility based on income

Given Joe is a regular Frequent Flyer earning <income>

When Joe applies for a Flying High Credit Card

Then his application should be <result>


Examples:

| income | result | notes |

| 120000 | automatic | Income >= 120000 |

| 100000 | manual | |

| 49999 | declined | Income < 50000 |

But Frequent Flyer status also influences eligibility. Depending on a member’s status, their application may be accepted automatically, even with a lower income:

Scenario Outline: Credit card eligibility based on income and status

Given Joe is a <status> Frequent Flyer earning <income>

When Joe applies for a Flying High Credit Card

Then his application should be <result>


Examples:

| status | income | result | notes |

| gold | 80000 | automatic | Automatically approved for >= $80000 |

| gold | 79999 | manual | |

| gold | 49999 | declined | |

| silver | 100000 | automatic | Automatically approved for >= $100000 |

| silver | 99999 | manual | |

| bronze | 110000 | automatic | Automatically approved for >= $110000 |

| bronze | 109999 | manual | |

As you can see here, table-based scenarios are a great way to explore business rules, and they can be used to discuss and illustrate both positive and negative scenarios.

Note that none of these examples rely on a particular UI. If these rules are only ever calculated on the server side, they could be tested without having to manipulate the UI at all. This is often the case with acceptance criteria around business rules.

Exercise 9.1

The credit card eligibility table in the previous example is incomplete. For example, it doesn’t explore examples of eligibility that depend on age or job history. Complete the preceding scenario with additional business rules until you think you have enough to implement a solution.

Discovering examples through conversation

As you’ve seen, conversation is the cornerstone of BDD, but not all conversations are equal. BDD conversations are more valuable when they’re discovering new examples and expanding the team’s collective understanding. In practice, the time a team has to work through and clarify acceptance criteria isn’t infinite, and this process monopolizes the time and attention of several team members. It’s important to focus first on high-value business rules and on areas of uncertainty, and then on features that are better known and less risky.

For example, suppose you’re working on a story card related to authentication. If your application uses an existing, well-known authentication mechanism that the team has already worked with, the acceptance criteria for the related business rules can afford to be succinct. You might have a basic scenario such as the following:

Given Joe has a valid LDAP account

And Joe has application permissions for application X

When Joe logs on

Then Joe should be given access to the application

To this you might add a few counterexamples: invalid username or password, insufficient permissions, and so forth. But if this is familiar territory for the team, there’s little benefit spending a lot of time working through these examples together, and the developers may be able to implement the feature using only a small set of examples.

On the other hand, if the application uses a new authentication strategy or technology, it will be valuable to spend more time working through the unknowns. Where are user accounts stored? Will there be single-sign-on between this application and other applications within the organization? What application-specific permissions need to be managed, and who will manage them? And so on.

One of the big benefits of BDD comes from the way conversation can be used to reveal assumptions and flush out uncertainty, expanding your collective understanding of a problem domain. Naturally, this process is more beneficial where there are unknowns to be flushed out, and the law of diminishing returns is very applicable. As a rule of thumb, if you’re in a Three Amigos session and the scenarios start to feel like you’re “stating the obvious,” it’s probably time to move on to another requirement.

In fact, although you’ll need to have UI-based acceptance criteria that illustrate how users will interact with the application, and to demonstrate and verify any business logic that’s implemented within the UI layer itself, acceptance criteria that relate to business rules implemented on the server may not need to exercise the UI at all. In the rest of this chapter we’ll discuss ways to effectively automate acceptance criteria like these without the need for automated UI tests.

9.3. Types of non-UI automated acceptance tests

As you’ve seen, UI tests have their place when it comes to testing and illustrating user interactions with the system and the user experience as a whole, but it’s often inefficient to use them for more detailed business rules. There are a number of strategies that can be used to bypass heavyweight UI tests when automating your acceptance criteria, and many tools can be used to implement these approaches. Some of the more commonly used approaches include

· Testing against the controller-layer application code

· Testing directly against business logic in the application code

· Testing services remotely, such as by invoking web services and thus avoiding the UI layer entirely

Figure 9.4 illustrates these strategies in use with a typical modern web application.

Figure 9.4. Applications need different types of tests for different parts of the system.

In the following sections, we’ll look at each of these options in more detail.

9.3.1. Testing against the controller layer

In some cases, you may be able to implement workflow and validation acceptance criteria without touching the web interface at all. This is particularly true if your application uses one of the many variations of the Model-View-Controller (MVC) architecture. MVC is a widely used architecture pattern that aims at cleanly separating the data an application uses from the way it’s presented to the user (see figure 9.5).

Figure 9.5. The MVC architecture pattern aims at cleanly separating application data from the way it’s presented.

If your application uses a variation of the MVC architecture, you may be able to write some of your acceptance tests directly against the controller layer. This is sometimes referred to as testing “under the skin” of the application. When you automate acceptance tests for a web application at the controller level, for example, you simulate requests arriving at a controller, and then check which page the controller displays and what data the controller places in the HTTP session or request scope.

In the Java world, Spring is an excellent and popular application development framework that provides first-class support for testing the controller layer. The web application module, Spring MVC, makes it easy to both write and test controllers using Java annotations to handle web requests.[1]

1 For more information, see section 11.3.6, “Spring MVC Test Framework,” of the Spring Framework Reference Documentation.

Suppose you’re writing a Spring MVC application to monitor real-time flights for Flying High. You need a page that displays the current flight status for a particular flight. The corresponding acceptance criteria might look like this:

Feature: Displaying flight status

Scenario: Provide a positive visual cue for on-time flights

Given that flight FH-101 has no reported delays

When I check the flight status

Then I should see that it is on time

And I should see its scheduled arrival time

In a Spring MVC application, the view is typically a JSP page template that renders an HTML page containing data from the model. The controller receives queries from the user, prepares the model data to be displayed, and decides what page to render. For the flight status screen, the controller might receive the flight number as a parameter, retrieve the current flight status from the model (most likely via a service layer), and prepare the flight status screen with the corresponding status information.

Cucumber-JVM provides excellent integration with Spring and Spring MVC. You could automate this scenario using Cucumber in Java with the following code:

In this test you use Spring to configure the components and services it needs to retrieve the flight status, and you set up the test data using an existing service class. Next, you invoke the flight status controller with the flightId parameter and check that the controller retrieved the expected values and that you’re redirected to the right screen.
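Stripped of the Spring wiring, the same three moves can be sketched in plain Java. This is a framework-free illustration rather than Spring MVC code: FlightStatusService, FlightStatusController, the "flightstatus" view name, and the arrival time are all hypothetical names and values chosen for the example. The sketch stubs the service, invokes the controller with a flight number, and then inspects the model and the view name the controller returns.

```java
import java.util.HashMap;
import java.util.Map;

class ControllerLayerSketch {

    // Hypothetical service interface; a real application would back it with live flight data.
    interface FlightStatusService {
        String statusOf(String flightNumber);
        String scheduledArrivalOf(String flightNumber);
    }

    // Hypothetical controller: fills the model and returns the name of the view to render.
    static class FlightStatusController {
        private final FlightStatusService service;

        FlightStatusController(FlightStatusService service) {
            this.service = service;
        }

        String showStatus(String flightNumber, Map<String, Object> model) {
            model.put("flightNumber", flightNumber);
            model.put("status", service.statusOf(flightNumber));
            model.put("scheduledArrival", service.scheduledArrivalOf(flightNumber));
            return "flightstatus";
        }
    }

    // Given that the flight has no reported delays (stubbed data; 07:25 is an arbitrary value).
    static FlightStatusService onTimeFlights() {
        return new FlightStatusService() {
            public String statusOf(String flightNumber) { return "ontime"; }
            public String scheduledArrivalOf(String flightNumber) { return "07:25"; }
        };
    }

    public static void main(String[] args) {
        FlightStatusController controller = new FlightStatusController(onTimeFlights());
        Map<String, Object> model = new HashMap<>();

        // When I check the flight status
        String view = controller.showStatus("FH-101", model);

        // Then I should see that it is on time, along with its scheduled arrival time
        System.out.println(view + " " + model.get("status") + " " + model.get("scheduledArrival"));
    }
}
```

Because nothing here touches a browser or a servlet container, a check like this runs in milliseconds, but it says nothing about how the "flightstatus" view is actually rendered.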

Controller-level testing isn’t the exclusive domain of server-side development: it can apply equally to the client side. Let’s look at an example. AngularJS is a popular and elegant JavaScript MVC framework used to build well-designed and easily maintainable single-page applications. It can be compared to other JavaScript MVC frameworks such as Backbone.js and Ember.js. As you’ll see, one of the areas in which AngularJS shines is ease of testing. Let’s see what a controller-level test would look like in AngularJS.

Suppose you’re writing an AngularJS app to monitor real-time flights for Flying High, using the requirements we discussed previously. To do this, you might write an AngularJS controller that talks to a service layer whose job is to provide information about the status of a given flight.

AngularJS provides tight integration with the Jasmine JavaScript testing library, so for simplicity you’ll implement this scenario in Jasmine.[2] A basic version of an AngularJS test for this scenario that calls the controller directly might look like this:

2 For tighter integration with the scenario text, you could also use Cucumber-JS or Yadda.

In this test, you simulate an HTTP request that passes the flight ID, FH-101, to your AngularJS controller. You then call the controller with this parameter and check what data the controller has placed into the page scope. The controller uses the flightService object, which AngularJS sets up for you. This makes this test an integration test: you’re assuming that a test database has been configured to return the right status for flight FH-101.

If you wanted to have more direct control over the flight status, you could easily replace the flightService object with your own stub:

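flightService = {
  // Stubbed service: returns a fixed status for each known flight number
  getStatus: function(flightNumber) {
    switch (flightNumber) {
      case 'FH-101': return 'ontime';
      case 'FH-102': return 'delayed';
    }
  }
};
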
Both would be valid strategies: the first focuses on ensuring that all the components are wired together correctly, whereas the second is more interested in exploring different scenarios.

Testing against the controller layer is certainly a lot faster than web testing, but it does have some limitations. For more involved scenarios, the code can be complicated to set up and hard to maintain compared to an equivalent web test. When you check the routing instructions or error messages produced by a controller component, you have no way of knowing if this routing goes to the correct web page or whether the error messages are rendered correctly. You may need to complement your controller-level tests with a few UI-based tests that illustrate these more visual aspects.

This approach is also closer to unit or integration testing than the end-to-end web tests, which may inspire less confidence from nontechnical team members. Compared to web tests, where they can see visual feedback about the application’s behavior, stakeholders will have a much harder time understanding (and therefore trusting) the results of controller-layer tests. When deciding how to implement your acceptance criteria, there’s often a balance to be struck between confidence in the tests and speed of execution.

9.3.2. Testing business logic directly

Some application features and some kinds of business logic are relatively independent of the UI. The credit card eligibility rule we discussed earlier is a good example of this:

Scenario Outline: Credit card eligibility based on income and status

Given Joe is a <status> Frequent Flyer earning <income>

When Joe applies for a Flying High Credit Card

Then his application should be <outcome>


Examples:

| status | income | outcome | notes |

| gold | 80000 | automatic | Automatically approved for >= $80000 |

| gold | 79999 | manual | |

| gold | 49999 | declined | |

| silver | 100000 | automatic | Automatically approved for >= $100000 |

| silver | 99999 | manual | |

| bronze | 110000 | automatic | Automatically approved for >= $110000 |

| bronze | 109999 | manual | |

You might implement this feature as illustrated in figure 9.6. Joe applies for a Frequent Flyer credit card online using an AngularJS application, which submits his details to a web service to determine whether he is eligible or not. The web service in turn calls a business service component that does the actual eligibility calculation and returns the result to the web page to be displayed.

Figure 9.6. The Flying High application invokes a business process through a web service to evaluate credit card applications in real time.

In this case, you need to describe and verify the behavior of the AngularJS application, such as by using the web testing techniques discussed in chapter 8. You also need to document and verify the behavior of the web service, as we discussed in the previous section. But the core business logic around the eligibility calculation is neither in the UI nor in the web service: it’s in the Credit Card Application Calculator module. There are many different scenarios that need to be verified, and it’s potentially inefficient to test them all through the UI.

A more effective strategy might be to test the UI for each possible eligibility outcome (presuming they’re displayed differently on the screen), and test the Credit Card Application Calculator directly to verify that the various status and income combinations produce the expected outcomes. Instead of eight UI tests, you’d have three (one for each eligibility outcome), plus eight tests written directly against the application code.
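To make "testing the calculator directly" concrete, here is a minimal plain-Java sketch of such a rule, exercised without any UI or web service. The class, enum, and method names are hypothetical. The thresholds come from the example tables, and the sketch assumes that the $50,000 decline threshold applies to every status, which the examples suggest but don't state outright.

```java
class EligibilitySketch {

    enum Status { REGULAR, BRONZE, SILVER, GOLD }

    enum Outcome { AUTOMATIC, MANUAL, DECLINED }

    // Assumption: applications below this income are declined whatever the member's status.
    static final int DECLINE_BELOW = 50_000;

    // Minimum income for automatic approval, taken from the example tables.
    static int automaticFrom(Status status) {
        switch (status) {
            case GOLD:   return 80_000;
            case SILVER: return 100_000;
            case BRONZE: return 110_000;
            default:     return 120_000; // regular Frequent Flyers
        }
    }

    static Outcome evaluate(Status status, int income) {
        if (income < DECLINE_BELOW) {
            return Outcome.DECLINED;
        }
        if (income >= automaticFrom(status)) {
            return Outcome.AUTOMATIC;
        }
        return Outcome.MANUAL;
    }

    public static void main(String[] args) {
        // Each call corresponds to one row of the examples table.
        System.out.println(evaluate(Status.GOLD, 80_000));   // AUTOMATIC
        System.out.println(evaluate(Status.GOLD, 79_999));   // MANUAL
        System.out.println(evaluate(Status.GOLD, 49_999));   // DECLINED
        System.out.println(evaluate(Status.SILVER, 99_999)); // MANUAL
    }
}
```

A step definition for the scenario outline would do little more than convert the status and income columns and call evaluate(), so adding another row to the table costs almost nothing in execution time.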

Other features may operate quite independently of the UI, and it would be difficult or impractical to test them via the UI. Batch processes, file or transaction processing, and back-end services are good examples of this type of scenario. In these cases, automating the acceptance criteria by exercising the application code directly is a viable option.

Let’s look at another example. Suppose the Flying High flight-tracking system needs to process baggage registrations coming from the legacy baggage-handling system. This system provides the registrations in a well-defined text-based message format. You need to extract data from these baggage registrations and inject them into your system, routing them to different subsystems depending on whether the baggage is booked on a domestic or international flight and whether the flight is a direct one or involves transfers (see figure 9.7).

Figure 9.7. The Flying High baggage-routing system takes legacy messages and routes them to different subsystems.

One way you could do this is to embed the actual content of the message in the feature file. One such scenario, implemented using Cucumber in Java, might look like this:

Alternatively, you could use a set of named, well-defined message files. This approach is similar to the personas we looked at in section 7.2.5; it allows you to make the tests more concise by hiding the message details behind a well-known identifier.

Scenario Outline: Baggage is processed according to its itinerary type

Given a baggage registration message <message>

When the baggage registrations are processed

Then the bags should be placed in the <workflow> workflow


Examples:

| message | workflow |

| sydney_melbourne.txt | domestic-direct |

| sydney_melbourne-hobart.txt | domestic-transfer |

| sydney_wellington.txt | international-direct |

| sydney_toronto.txt | international-transfer |

In both of these scenarios, there will be a web page that monitors the state of registered bags, but that’s not really what’s being tested here. Testing application code is faster and more robust than exercising the UI, and the bulk of the business rules can often be safely tested directly against the application code, but such testing may inspire less confidence in the results. A small number of additional end-to-end UI tests can alleviate these doubts and verify that the baggage details are correctly displayed.
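The routing rule behind the workflow column can be sketched in a few lines of plain Java. The legacy message format isn't shown here, so this sketch starts from an already-parsed list of airport codes. The class name, the set of airports treated as domestic, and the Vancouver stopover used for the Sydney to Toronto itinerary are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Set;

class BaggageRoutingSketch {

    // Assumption: the airports Flying High treats as domestic.
    static final Set<String> DOMESTIC_AIRPORTS = Set.of("SYD", "MEL", "HBA");

    // stops lists every airport on the itinerary: origin first, final destination last.
    static String workflowFor(List<String> stops) {
        boolean domestic = DOMESTIC_AIRPORTS.containsAll(stops);
        boolean direct = stops.size() == 2;
        return (domestic ? "domestic" : "international") + "-" + (direct ? "direct" : "transfer");
    }

    public static void main(String[] args) {
        System.out.println(workflowFor(List.of("SYD", "MEL")));        // domestic-direct
        System.out.println(workflowFor(List.of("SYD", "MEL", "HBA"))); // domestic-transfer
        System.out.println(workflowFor(List.of("SYD", "WLG")));        // international-direct
        System.out.println(workflowFor(List.of("SYD", "YVR", "YYZ"))); // international-transfer
    }
}
```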

These scenarios are typical of many business-focused acceptance criteria, especially scenarios involving file processing, message routing, and so forth. In general, acceptance criteria should be of the end-to-end variety. They should, at least in principle, exercise all of the components of the system and verify how they work together. This is usually a more reliable way to build confidence in the acceptance tests. But testing specific business rules can often be done more efficiently within an isolated context, especially when an acceptance criterion is focused on a specific calculation or operation, with well-defined input parameters and easily identifiable outcomes.

The Cucumber step definitions for the first scenario might look like this:

Here you create a new baggage registration service and use it to transform the message into a BaggageRegistration object. Then you check that the generated registration object has the same values that you provided in the Then statement.

Most real-world Java applications would use a dependency injection library, such as Spring or Guice, to manage dependencies between the modules in a cleaner manner. As you saw earlier, Java-based BDD tools like Cucumber and JBehave provide integration with the main dependency-management libraries, allowing you to automatically configure and inject other production service classes into the service you’re testing.

9.3.3. Testing the service layer

Many, if not most, modern applications use some kind of service-based architecture. A service is a well-defined piece of functionality that can be called by other components or applications. Services generally act as facades, gateways, or aggregators for lower-level components (see figure 9.8). In many applications, the UI layer can’t invoke lower-level business or database components directly, but instead must use one of the services available in the service layer.

Figure 9.8. An example of an architecture using a service layer

Such services can be incorporated within a larger application, or they may take the form of discrete services that can be deployed and redeployed individually as needed and can be reused across multiple applications. This second approach is what’s generally understood by the term Service-Oriented Architecture (SOA). In both cases, a well-designed service architecture allows services to be combined and reused in different parts of an application, or by different applications, to build and deliver new business functionality more quickly and more reliably.

The service layer plays a key role in good system architecture. You’ve already seen how many web applications are now built around sophisticated client applications, such as modern JavaScript-based single-page applications using frameworks like AngularJS and Backbone. And as you’ve seen, these client applications have their own complex business logic and behavior, and they invoke web services to obtain the data they need. Many mobile applications are implemented using a similar approach, with a native client application calling remote services.

In both these scenarios, the service layer can be considered as a separate API, with specific business goals and requirements. These requirements will be driven by the client applications that consume the services, and which may be developed by different teams. In BDD terms, the client application and the service API should be considered separately, each with its own set of scenarios and executable requirements.

A service-based architecture works well with BDD. By definition, services are designed to be reusable, so it makes sense to define them clearly and cleanly, and to document them with worked examples. When you design a new service using BDD, you begin by describing the requirements in the form of practical examples of how the service will be used. This tends to produce cleaner, more focused services that are easier to understand. If it’s hard to find clean scenarios with well-defined input parameters and expected outcomes, then the service being designed may need a little more thought.

This approach also makes the reuse of these services easier. Well-written BDD scenarios give a clear description of what a service is meant to do and how it works, and the automation steps give worked examples of how to interact with a service that new developers can use to get started quickly.

One of the most popular ways to implement an SOA is to use web services. REST web services in particular are becoming increasingly widespread, though the more heavyweight SOAP web services are still widely used.

Let’s look at a few examples of working with both REST and SOAP in Java and .NET.

RESTful web services with Cucumber and Java

Imagine you’re designing web services for the Flying High Frequent Flyers application. One of the core services might be to display the details of a given flight. This service is required in several different screens and by the back-end payment processing system:

Feature: Retrieve information about a given flight

Scenario: Find flight details by flight number

Given I need to know the details of flight number FH-101

When I request the details about this flight

Then I should receive the following:

| flightNumber | departure | destination | time |

| FH-101 | MEL | SYD | 06:00 |

For simplicity, this example supposes that flight FH-101 is a well-known entity in the system. The key value in this scenario is discovering precisely what information you need to retrieve about a given flight.

Note that although you’re implementing a web service, this scenario is implementation-neutral and would be perfectly understandable by team members not fluent in JSON or XML. It’s also a good first approach to the problem that will help the team agree on the goals of a particular web service, its input parameters, and what information the service will provide.

You could automate this scenario using Cucumber in Java as shown here:

The automation code is relatively simple. You use a class written for your tests to access the web service and retrieve the result in the form of a Java object.

To verify the outcomes, you use Cucumber’s DataTable class, which provides a convenient diff() method to compare the contents of a table with the corresponding field values from a list of Java objects, using column headers as field names. This is a concise way of checking that the Flight object that you retrieved from the web service matches the values described in the scenario.
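To make the idea behind this table comparison concrete, here is a minimal sketch in plain Java. It is not Cucumber's implementation of diff(); it only illustrates the principle of treating each column header as a field name and reporting the cells that differ. The Flight fields are taken from the scenario's column headers, and the TableComparison helper is a hypothetical name introduced for illustration.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical domain object; field names mirror the scenario's column headers.
class Flight {
    final String flightNumber;
    final String departure;
    final String destination;
    final String time;

    Flight(String flightNumber, String departure, String destination, String time) {
        this.flightNumber = flightNumber;
        this.departure = departure;
        this.destination = destination;
        this.time = time;
    }
}

// Illustrative helper only: this is NOT Cucumber's DataTable API.
class TableComparison {

    // Returns one message per mismatching cell; an empty list means the row matches.
    static List<String> differences(Map<String, String> expectedRow, Object actual) {
        List<String> mismatches = new ArrayList<>();
        for (Map.Entry<String, String> cell : expectedRow.entrySet()) {
            String actualValue;
            try {
                // The column header is used as the field name.
                Field field = actual.getClass().getDeclaredField(cell.getKey());
                field.setAccessible(true);
                actualValue = String.valueOf(field.get(actual));
            } catch (ReflectiveOperationException e) {
                actualValue = "<no such field>";
            }
            if (!cell.getValue().equals(actualValue)) {
                mismatches.add(cell.getKey() + ": expected " + cell.getValue()
                        + " but was " + actualValue);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("flightNumber", "FH-101");
        expected.put("departure", "MEL");
        expected.put("destination", "SYD");
        expected.put("time", "06:00");

        Flight actual = new Flight("FH-101", "MEL", "SYD", "06:00");
        System.out.println(differences(expected, actual)); // prints []
    }
}
```

Reporting every mismatching cell, rather than stopping at the first one, keeps the failure message close to what the business-readable table says.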

When it comes to automating the BDD scenarios that verify services, it’s common to use exactly the same libraries that are used to access them in the production code. This makes sense, because the automated steps act as examples for other developers to follow when reusing the services. In this example, you’re using JAX-RS, the standard Java API for implementing web services and web service clients, together with Jersey, its reference implementation:

This level of feature is sufficient and appropriate for many scenarios. But some of the scenarios you write to describe a service may have a more technical audience than more business-focused scenarios. In some contexts, for example, if you’re implementing a JavaScript UI that relies heavily on JSON, it can make sense to provide sample JSON output as part of the BDD scenarios. In this environment, the target audience of the living documentation will be both the developers who are implementing the UI and the business, which will need to understand the business logic. If the JSON is simple and clean enough to make sense to both parties, there’s no reason not to include the JSON output directly in the scenario:

You could automate these steps in a similar way to the previous example:

Here you use the JSONassert library to compare the JSON data you receive from the web service with that given in the feature file. And once again, you delegate the actual code that invokes the web service to the FlightStatusClient class you saw earlier:

This is a simple example, but JSON is relatively readable, and more complicated scenarios are quite possible without compromising readability. The following scenario provides test data to be injected into the flight schedule database:

Providing sample output in JSON (or XML) form is a more technical approach that’s not suitable for all scenarios. But when the audience is technical enough to understand the output, and the JSON format will provide added value over simply returning the information, it can be beneficial to express requirements in this format.

There are many other ways to implement a service architecture, both within the context of a single application and in a broader SOA strategy. In the Java world, EJB 3 and Spring Remoting are other frequently used approaches. Teams using a .NET stack will typically use WCF (Windows Communication Foundation) or WebAPI.

9.4. Defining and testing nonfunctional requirements

BDD isn’t limited to purely functional requirements; the approach can also be used very successfully to discover and verify nonfunctional requirements. A nonfunctional requirement is traditionally defined as a (generally technical) requirement that’s not related to a particular feature but relates to how the application operates as a whole. Typical nonfunctional requirements include performance, stability, accessibility, and security.

Just like functional requirements, nonfunctional requirements should deliver value in some identifiable way, typically by increasing revenue or reducing costs. An unresponsive website will cost an organization in customers and sales. According to Amazon, 100 ms of page latency results in a loss of 1% of sales.[3] Google found that a delay of half a second resulted in a traffic drop of 20%.[4]

3 These figures were released in a presentation by Greg Linden, “Make Data Useful” (2006).

4 Marissa Mayer, Web 2.0 Conference, 2006.

BDD is a great way for teams to have a conversation about which nonfunctional requirements really matter for a particular application. BDD helps phrase these nonfunctional requirements in terms that stakeholders can understand and help define, and that can be easily verified and potentially automated.

In addition, not all nonfunctional requirements are equal. Expressing requirements as BDD scenarios can help stakeholders and developers determine what areas are most affected by nonfunctional requirements, such as performance or security.

For example, before doing any performance testing on an application, a team should ask end users and managers about their idea of acceptable performance metrics and the application’s anticipated load. Ideally, these expectations should be expressed in a way that’s meaningful to the business, but this isn’t always the case in practice. Many teams spend time and effort measuring performance metrics such as CPU and memory usage that aren’t necessarily related to the underlying business goals and don’t provide useful feedback to the business.

Let’s look at how this might work in practice. The Flying High Flight Status application relies on receiving timely updates from the flight status message server. One of the selling points of the Frequent Flyer program is its ability to provide clients with fast updates about flight status, including information about delays, expected arrival times, gates, baggage carousel numbers, and so forth.

From discussion with stakeholders and the marketing department, which had conducted market research and studied competing products, it emerged that updates needed to be dispatched within 30 seconds 95% of the time, and within 60 seconds 100% of the time, and that every update needed to be successfully dispatched.

You could express this performance requirement like this:

Feature: Application performance

Scenario: Flight status update performance

Given the flight status server is running

When a production peak hour of updates is sent at 5 times production


Then 95% of the updates should be received within 30s

And 100% of the updates should be received within 60s

And 100% of the updates should be received successfully

Once you’ve expressed the performance requirements in these terms, you can automate. There are many open source and commercial load-testing tools, and most can be scripted. Popular open source options in the Java world include SoapUI, JMeter, and The Grinder. These tools let you define and execute test scripts that simulate interactions with an application, such as requesting a web page, invoking a web service, and querying an EJB. To simulate load, they can run scripts on a number of remote machines as well as locally. They’re also relatively easy to automate and to run programmatically. This makes them easy to integrate into tools such as Cucumber and JBehave.

Load testing like this needs realistic production-like data. In general, the best way to get an idea of the number of messages sent during peak times would be to observe the current production system, if one exists. Many teams even record a selection of production data for use in load testing. If you’re building the entire system from scratch, on the other hand, the team would have to agree on some figures based on the expected usage patterns.

Cucumber, Java, and The Grinder for real-world performance testing

When Tom Howard needed to verify the performance of an application built around a messaging platform for a large electricity company, he chose to use a BDD approach.[5] To do this, he used The Grinder and Cucumber-JVM to automate a BDD performance scenario similar to the one discussed here. The main Cucumber step invoked a Grinder script directly and waited for it to finish. The Grinder script ran a series of requests based on real production data from a number of worker machines to simulate production-like loads.

5 Tom Howard, “Cucumber-JVM + The Grinder + Math = Performance.Testing.Heaven.”

Once the script was done, the test results were recorded in a CSV log file. Tom used open source libraries to parse the CSV file and determine the 95th and 100th percentiles, which were in turn used to decide whether the scenario passed or failed.
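The pass/fail decision described here can be sketched in a few lines of plain Java. This is not Tom Howard's code: the class name, the method names, and the choice of the nearest-rank percentile definition are assumptions made for illustration, and the thresholds are the 30-second and 60-second limits from the scenario earlier in this section. The sketch assumes the latencies (in milliseconds) have already been parsed from the log file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of The Grinder or Cucumber.
class UpdateLatencyCheck {

    // Nearest-rank percentile: the smallest sample such that at least
    // `percent` percent of all samples are less than or equal to it.
    static long percentile(List<Long> latenciesMillis, double percent) {
        if (latenciesMillis.isEmpty() || percent <= 0 || percent > 100) {
            throw new IllegalArgumentException("need samples and 0 < percent <= 100");
        }
        List<Long> sorted = new ArrayList<>(latenciesMillis);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(percent * sorted.size() / 100.0);
        return sorted.get(rank - 1);
    }

    // Mirrors the three "Then" steps of the scenario: every update received,
    // 95% within 30 seconds, and 100% within 60 seconds.
    static boolean meetsRequirement(List<Long> latenciesMillis, int updatesSent) {
        return latenciesMillis.size() == updatesSent
                && percentile(latenciesMillis, 95) <= 30_000
                && percentile(latenciesMillis, 100) <= 60_000;
    }

    public static void main(String[] args) {
        // 19 updates received after 10 s and one after 45 s.
        List<Long> latencies = new ArrayList<>(Collections.nCopies(19, 10_000L));
        latencies.add(45_000L);
        System.out.println(meetsRequirement(latencies, 20)); // prints true
    }
}
```

A Cucumber step could call a check like meetsRequirement() once the load script has finished and fail the scenario when it returns false.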

It might be tempting to write automated tests for more low-level performance metrics such as memory consumption, CPU usage, average response time, and so forth. But this is generally not a cost-effective approach. During performance testing, a system can underperform or fail for many different reasons: insufficient memory, a slow database, network issues, suboptimal code, and so forth. In addition, in the real world, performance issues inevitably come from unexpected places, and it’s very hard to predict or test for every possible problem.

But in business terms, poor performance at any point will only really be of concern if it’s reflected in the performance requirements described in the BDD scenario, and this will be reflected by a failing BDD scenario.

A more effective approach is to monitor and record low-level performance metrics such as memory and CPU usage, but not to include these in any automated tests. If the BDD scenario does fail, then you’ve identified a performance issue that can potentially impact the business. In this case, you can use the recorded metrics to investigate and troubleshoot the issue.

9.5. Discovering the design

BDD is an excellent way to design clean, reusable service APIs, as well as clean, reusable APIs at all levels of the application. Well-designed applications are typically organized into layers and components (in design patterns, this principle is often referred to as the separation of concerns). If you think of an application in terms of these layers and components, starting from the UI and working down, each layer is the “user” of the next layer down, and any code that uses a component is effectively a user of that component. When the user is application code (or, more precisely, the developer writing the application code) rather than a physical person, you’re effectively writing an API. In fact, any code that you write can be considered an API for someone else, even if that someone is yourself.

Applying BDD principles to the actual implementation of these layers and components can help you write cleaner, better-designed, and more maintainable code. When you implement new code, you think about what data and services you need and how you’d ideally like to obtain them. If they don’t exist yet, you implement your code as if they do exist, and then use this code as the starting point for their implementation (see figure 9.9). This is a form of the Feature Injection idea you saw in chapter 3.

Figure 9.9. Writing the code you’d like to have. Here, you use the “quick-fix” feature available in most modern IDEs to get the IDE to generate the missing method for you.

For example, imagine you’re working on implementing a new feature for the Flying High Human Resources application. The requirement is to export an Excel spreadsheet containing a list of employees who have a birthday in the current week. The corresponding scenario might look like this:

Scenario: Export staff birthdays as an Excel spreadsheet

Given the following staff members:

| Name | Birthday |

| Joe | 10-Mar-1980 |

| Jill | 18-Dec-1965 |

| Jack | 20-Dec-1965 |

| Joan | 20-Nov-1991 |

And today is 16-Dec-2013

When I export this week's birthday list

Then I should obtain a spreadsheet containing the following:

| Name | Birthday |

| Jill | 18-Dec-1965 |

| Jack | 20-Dec-1965 |

As you implement the step methods for this scenario, you’ll use the scenario example to drive out the features you need from the next layer down.
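As a rough illustration of the kind of code this scenario might drive out, the following sketch shows a birthday filter that a step method could wish into existence. The BirthdayList class and its birthdaysThisWeek() method are hypothetical names, and the sketch assumes that "this week" means the Monday-to-Sunday week containing today, which the scenario doesn't specify. Exporting the result to a spreadsheet is left out.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.MonthDay;
import java.time.temporal.TemporalAdjusters;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical component discovered while writing the step methods.
class BirthdayList {

    // Names of staff whose birthday falls in the Monday-to-Sunday week containing `today`.
    // Assumption: a 29 February birthday is only matched in leap years.
    static List<String> birthdaysThisWeek(Map<String, LocalDate> birthDates, LocalDate today) {
        LocalDate monday = today.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, LocalDate> staffMember : birthDates.entrySet()) {
            MonthDay birthday = MonthDay.from(staffMember.getValue());
            // Compare against each day of the week so that weeks spanning
            // a month or year boundary are handled correctly.
            for (int offset = 0; offset < 7; offset++) {
                if (MonthDay.from(monday.plusDays(offset)).equals(birthday)) {
                    names.add(staffMember.getKey());
                    break;
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, LocalDate> staff = new LinkedHashMap<>();
        staff.put("Joe", LocalDate.of(1980, 3, 10));
        staff.put("Jill", LocalDate.of(1965, 12, 18));
        staff.put("Jack", LocalDate.of(1965, 12, 20));
        staff.put("Joan", LocalDate.of(1991, 11, 20));

        System.out.println(birthdaysThisWeek(staff, LocalDate.of(2013, 12, 16))); // prints [Jill, Jack]
    }
}
```

Passing today in as a parameter, rather than reading the system clock inside the method, is what lets the "And today is 16-Dec-2013" step control the outcome.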

This process doesn’t stop with the step implementation. When you write the production code, you can do exactly the same thing, discovering what services you need from other components (which may not exist yet), and writing the code you’d like to have. This act of imagining the code that would serve you best is a powerful design practice: what information do you need to get the job done? What services do you need to call? Do they exist, or do they need to be created? What would be the most convenient way to obtain the data you need, and in what form? When you write this imaginary code, you’re providing an example of how you’d like to use an API that doesn’t yet exist. Because you haven’t written any implementation code yet, this example will not be polluted by preconceived ideas about what the technical solution should look like; rather, it will be driven by what would be easiest to use from the perspective of the developer using the API.

In this way, BDD at the requirements level (where you use more communication-focused tools, such as Cucumber and JBehave) flows naturally to BDD at a more technical level (where you often use tools more traditionally associated with unit testing). We’ll study this lower-level form of BDD in much more detail in the next chapter.

9.6. Summary

In this chapter you learned about different sorts of non-UI acceptance testing, including

· When you should use UI tests and when it’s more appropriate to use non-web tests

· How to write BDD scenarios that bypass the UI and exercise the controller layer, service layer, or business logic directly

· How to write BDD scenarios for remote services such as web services

· How to use BDD scenarios to describe and verify nonfunctional requirements, such as performance

· How BDD techniques at the requirements level help encourage clean, well-designed APIs, and lead naturally to lower-level BDD practices at the unit- and integration-testing level

In the next chapter, you’ll learn how BDD practices are also applicable at the coding level, and how this sort of lower-level BDD relates to Test-Driven Development and more traditional unit-testing practices.