Chapter 1. The Practice of Test Automation

Test automation is a practice that will make you think differently about coding. A typical non-tester approaches a problem by fiddling with some code in the editor and changing it until it works. Like working with clay, you start with a lump and carefully craft it into a bowl; once satisfied, you let it dry. Once it has dried, there is no way you can change it.

When you start doing test automation, you will quickly identify the key issues with how you've been writing code before:

· You start writing code on a blank sheet without any clear intent about the result

· You don't know when it's time to stop writing code

· You don't know whether your code will keep on working when you add more code

Test automation comes to grips with these issues and provides a process for writing code in a more structured and organized fashion. You start out with a clear intent, implement the code until your tests are green, and refactor it until you're happy with the end result.

Functional programming will open your mind to the flaws in the code you've written previously. You will find that the number of programming errors is reduced when your code becomes stateless. Complexity is reduced by removing the deep object dependency graph from your application. The intent gets clearer when all a program consists of is functions and data, where functions operate on data.

Together, test automation and functional programming make a harmonious match that brings good coding practice together with good code, making you, the programmer, fall into the pit of success. By reading this book, you will understand how to combine the two and become a better programmer.

In this chapter, we will cover the following topics:

· What is testing

· The purpose of testing

· Testing with intent

· Writing regression tests

Testing as a practice

Before diving into why we need test automation, we should consider what it really is. The practice is still quite new and there is some confusion surrounding it, leading to developers testing the wrong thing and managers not knowing what to expect.

Black or white box testing

Testing practices are often split into black box and white box tests. The difference lies in how much we know about the system we're testing. If all we know about the system is what we can see from the outside, and all we can do with it is interact with its outward interfaces, then the method of testing is considered black box testing.

On the other hand, if our testing knows about the inner workings of the system and is able to trigger events or set values within it, then it is referred to as white box testing.

These are two slightly different viewpoints to consider when testing. In test automation, we need both: white box testing sits closer to the implementation, while black box testing leans toward the abstraction level of a user requirement.

Manual testing

Manual testing is a practice used to investigate a product and determine its quality. This is done by a person called a tester and is performed by executing the program or using a tool to examine it. The testing will validate that the product meets its requirements and determine if the system is usable, but most importantly, it will validate that the product solves the problem it was created for.

The following image shows how testing fits into the normal flow of software development:

[Figure: Manual testing]

The result of manual testing is a set of issues that gets reported to the development team. Some of these issues are labeled as bugs, defects, or just changes. The tester will rate them based on priority (blocker, critical, high, medium, or low) and incorporate them into the development process.

The term manual testing is usually just called testing, but to avoid confusion, I will refer to testing done by a tester as manual testing and testing that is executed by a computer as test automation.

Test automation

Test automation is a practice used to create checks that will verify the correctness of a product. These checks are written in code or created in a tool, which will then be responsible for carrying out the test. The nature of these checks is that they are based on the requirements and are reproducible through the automation.

The following figure shows how test automation removes the need for a tester, reports, and issue tracking:

[Figure: Test automation]

Most commonly, test automation is performed by the development team and is an integrated part of the software development process. It doesn't replace the tester but puts an extra layer of quality assurance between the team and tester, leading to fewer issues reported by the tester.

The best kind of testing is that which requires little effort. The code is reviewed by the computer when compiling the program, verifying that it's possible to turn the code into machine instructions. For a statically typed language, this can be seen as the first line of testing, like a spell check.

Once the code is compiled, the programmer knows that it can be executed. It will not necessarily do what it's supposed to do, but it's guaranteed to run, which is not always the case for code that is interpreted at runtime.
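As a tiny illustration of this first line of testing, the following will not even compile; the compiler rejects the mistake before the code can ever run:

// error FS0001: this expression was expected to have type 'int'
// but here has type 'string'
let answer : int = "42"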

The following table shows the layers of testing and what they verify:

Test activity       | Input                           | Verifies
------------------- | ------------------------------- | ------------------
Compiling           | Source code                     | Syntax correctness
Style check         | Source code                     | Code style
Static analysis     | Source code / compiled assembly | Code correctness
Unit testing        | Compiled assembly               | Code correctness
Integration testing | Compiled assembly               | Code behavior
System testing      | Release version                 | Product behavior

Style check

A style check on the code ensures that it is properly formatted and enforces conventions such as naming standards, indentation, comments, and so on. This is very valuable in a team setting, as it increases the readability of the code and maximizes code sharing, since all developers use the same coding style. The result is higher quality and less friction, leading to fewer bugs and faster development.

For F#, there is a style-checking tool called FSharpLint, which is available through the NuGet package manager and can be used to check your code against style conventions.
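For example, a style check would typically flag a let-bound function that breaks the camelCase naming convention. The rule and wording here are illustrative, not FSharpLint's exact output:

// flagged: let-bound functions are conventionally camelCase
let Add_Numbers x y = x + y

// conforming
let addNumbers x y = x + y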

Static analysis

Static code analysis can be used to avoid unnecessary mistakes, such as unintended circular references or badly implemented patterns, for example, a poor implementation of IDisposable. It helps in avoiding problems that inexperienced developers would otherwise have with garbage collection and threading.

Tip

There are no good static analysis tools for F# as of this writing. In C#, one could use Visual Studio Code Analysis, previously known as FxCop, for static analysis.
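As a sketch of the kind of mistake such analysis looks for, consider a disposable resource that is never disposed; the use keyword is the idiomatic fix:

open System.IO

// leaking: the StreamWriter is never disposed, so the file handle
// is held until the garbage collector finalizes the writer
let logLine (path : string) (line : string) =
    let writer = new StreamWriter(path)
    writer.WriteLine(line)

// fixed: 'use' guarantees Dispose is called, even when an exception is thrown
let logLineSafely (path : string) (line : string) =
    use writer = new StreamWriter(path)
    writer.WriteLine(line)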

Unit testing

Unit tests are written at the same time as the code, before or after. They verify that the code produces the intended result. This is a form of white box testing that seeks to reduce the number of unintended defects that come out of development. If the unit testing is thorough, the code will do what the programmer intended.

Here's an example unit test:

open NUnit.Framework
open FsUnit

[<Test>]
let ``should return 3 from adding 1 and 2`` () =
    Calculator.add 1 2 |> should equal 3

Integration testing

An integration test is a test written by the programmer to verify his or her code's integration with other systems or parts of the system, such as databases and web services. The purpose of this testing is to find problems and side effects that only appear in the integration with those other systems. If integration testing is thorough, it will help with the stability of the system.

Here's an example integration test:

open NUnit.Framework
open FsUnit

[<Test>]
let ``should store new user to data storage`` () =
    // setup
    let newCustomer = { name = "Mikael Lundin"; address = "Drottninggatan 82 Stockholm" }
    // test: store the new customer to the database
    let customerID = CustomerRepository.save newCustomer
    // assert: reading it back returns the same customer
    let dbCustomer = CustomerRepository.get customerID
    dbCustomer |> should equal newCustomer

System testing

System testing is a form of black box testing that is performed in order to validate that the system requirements, both functional and nonfunctional, are fulfilled. System testing is a very broad term and is more often pushed to manual testing than it is automated. Executable specifications are one area where system testing automation excels.

Building trust

What you feel as a developer when you look at legacy code is distrust. You can't quite believe that this code is performing its duty properly. It seems easier to rewrite the whole thing than to make changes to the existing code base.

The most common type of bug comes from side effects the developer didn't anticipate. The risk of these is high when making changes in a code base that the developer doesn't know. Testers have a habit of focusing their testing on the feature that has been changed, without taking into account that no change is done in isolation and each change has the potential to affect any other feature. Most systems today have a big ball of spaghetti behind the screen, where everything is connected to everything else.

Once, I was consulting for a client that needed an urgent change in a Windows service. The client was adding online payment to one of their services and wanted to make sure customers were actually paying and not just skipping out on the payment step.

This was verified by a Windows service, querying the payment partner about whether the order had been paid. I was going to add some logic to send out an invoice if the online payment hadn't gone through.

The following is the invoice code:

// get all orders
OrderDatabase.getAllUnpaid()
|> Seq.map (fun order ->
    // for each order
    let mutable returnOrder = order
    let mutable orderStatus = OrderService.NotSet
    try
        // while status not found
        while orderStatus = NotSet do
            // try to get the order status
            orderStatus <- OrderService.getOrderStatus order.Number
            // set result depending on order status
            returnOrder <-
                match orderStatus with
                // paid or overpaid: set the correct status
                | Paid | OverPaid -> { order with IsPaid = true }
                // unpaid or partly paid: send an invoice
                | Unpaid | PartlyPaid -> { order with IsPaid = false; SendInvoice = true }
                // unknown status: try again later
                | _ -> returnOrder
    with
    | _ -> printf "Unknown error"
    returnOrder)
// update database with payment status
|> Seq.iter OrderDatabase.update

It was implemented and deployed to a test environment in which the logic was verified by a tester and then deployed to production, where it caused €50,000 in lost revenue.

My failure was in assuming that OrderService.getOrderStatus really worked when, in reality, it failed four out of five times. The way the service was built, it would just pick up those failed transactions again until it succeeded.

My addition to the code didn't take the side effect into account and started to mark most failed payments with the status of Paid even though they were not.

The code worked fine while debugging, so I assumed it was working. The code also worked fine while testing, so the tester assumed it was working too. Yet, it was still not enough to stop a crucial bug from hitting production.


Bad code is code that is poorly written and does not follow best practices: it swallows exceptions and lets the program continue to execute in a faulty state. This makes the code harder to change, and the risk becomes higher, as a change could introduce new bugs.

Tests written for a program will guarantee that the code has better structure and is easier to change, because the tests themselves require well-structured code in order to be written. Unit tests drive code toward better design and higher quality, making it easier to read and understand.

Integration tests will verify that code written to integrate with external systems is well written, with all the quirks the external system needs, and regression tests will verify that the intended functionality of a system is kept even after a change has been introduced.

Building trust with programmers is all about showing robustness, and this is done by tests. They lengthen the lifetime of a system, as those systems are open to change. They also shine through to the end user, as those systems will not crash or hang when the unexpected occurs.

The purpose of testing

When starting to learn about test-driven development, many developers struggle with the question: "Why are we doing this?" This is also reflected in the tests they write. They write tests to verify the framework they're using, or tests for trivial code. They write brittle tests, or tests that assert too much. They have not reflected on why they're testing, and often only do it because they've been told to, which is the worst kind of motivation.

The value of testing is shown in the following image:

[Figure: The purpose of testing]

The original illustration comes from a talk by Martin Fowler on refactoring, titled Why refactor?, and the same reasoning applies to testing. The value of testing comes not from quality, clean code, professionalism, or it being the right thing to do. The value is economics: you write tests in order to save money. Bad programming will lead to bugs in your software, which can have the following consequences:

· Projects running over time: This is because the team spends time on fixing bugs instead of writing new features. Bugs become a bottleneck for productivity.

· Corruption of data: The cost of retrieving lost data or a bad reputation for losing customer data will have substantial economic consequences.

· System looking unpolished: Your software will behave irrationally and the users will stop trusting your product. They will take their business elsewhere, to a competitor that doesn't let bad quality shine through.

We need to avoid bugs in order to avoid unnecessary and hard-to-predict costs. By adding testing to our process, we create predictability and reduce the risk to software development projects.

Note

ObamaCare, officially named The Patient Protection and Affordable Care Act, is a law signed on March 23, 2010, in the United States. It was aimed at reforming the American healthcare system by providing more Americans with access to affordable health insurance.

The US government issued a website where people could apply and enroll for private health insurance through ObamaCare. However, the launch of the website was dead in the water.

Not only was the site unable to handle the substantial load of visitors when going live, but it also struggled with performance problems for several months. The site sent personal information over unencrypted connections, and the e-mail verification system could be bypassed without any access to the given e-mail account.

An estimated 20 million Americans experienced the broken ObamaCare site, seriously hurting the reputation of software developers worldwide. By writing tests for our code, we will achieve higher quality, cleaner code, and a higher level of professionalism, but what it eventually boils down to is that the code we write will have greater value. Tested code will:

· Have fewer bugs: Bugs are expensive to fix. The code will be cheaper in the long run.

· Be better specified: This leads to fewer changes over time. The code will be cheaper in the long run.

· Be better designed: Bad code can't be tested. The tested code will be easier to read and less expensive to change.

All of these points of interest lead to predictability, a precious thing in software development.

When not to test

As part of mentoring software development teams, I tell developers to test everything, because they always seem to find some excuse for not writing tests.

Always write tests for your code, except if the following applies; if it does, then it makes no sense to test it:

· The code will never go into production

· The code is not valuable enough to spend tests on

· The code is not mission-critical

The most common excuse developers have for not writing tests is that they claim it is too hard. This holds true until they've learned how to, and they will not learn unless they try.

Testing with intent

There are several angles from which to approach writing tests for code, and it is important to understand them before you start, so that you avoid some of the bad practices. Tests written without a clear intent by the programmer are often characterized by being too long or asserting too much.

Asserting written code

The most important aspect of unit tests is to assert that the code has the intended result when executed. It is important that the author of the tests is the same as that of the code, or some of the intent might be lost in the process.

The following is a code snippet:

// System Under Test
let div x y = x / y

// Test
div 10 2 |> should equal 5

This might state the obvious, but a developer could easily mix up the order of incoming arguments:

// System Under Test
let div y x = x / y

// Test
div 10 2 |> should equal 5

Running the test would expose the following error:

NUnit.Framework.AssertionException:
  Expected: 5
  But was:  0

Tests give the developer a chance to state what is not obvious about the code but was still intended:

// System Under Test
let div x y = x / y

// Test
div 5 2 |> should equal 2
(fun () -> div 5 0 |> ignore) |> should throw typeof<System.DivideByZeroException>

The test verifies that the remainder of the integer division is truncated, and that the code should throw an exception if you try to divide 5 by 0. These are behaviors that are implicit in the code but should be explicit in the tests.

Writing these assertions is often a faster way to verify that the code does what was intended than starting a debugger, entering the correct parameters, or opening up a web browser.

Contracts versus tests

There is a technique called Design by Contract (DbC) that was invented by Bertrand Meyer while designing the Eiffel programming language. The basic idea of DbC is that you create contracts on software components stating what the component expects from the caller, what it guarantees, and what it maintains.

This means that the software will verify the acceptable input values, protect them against side effects, and add preconditions and postconditions to the code at runtime.

The idea of software contracts is very attractive, but the few attempts at implementing it for the .NET Framework have had limited success. The heritage of DbC is defensive programming, which simply means the following:

· Checking input arguments for valid values

· Asserting the output values of functions

The idea behind this is that it is better to crash than to continue to run with a faulty state. If the input of the function is not acceptable, it is allowed to crash. The same is true if the function is not able to produce a result, at which time it will crash rather than return a faulty or temporary result:

let div x y =
    // precondition
    assert (y > 0)
    assert (x > y)
    let result = x / y
    // postcondition
    assert (result > 0)
    result

Assertions such as these cannot be seen as a replacement for testing. The differences are pretty clear. The contracts are validated at runtime when debugging the code, but deactivated when compiling the code for release. Tests are written outside the main code base and executed on demand.

With good assertions, you'll find more problems when doing manual testing, as the risk of running tests with faulty data is much smaller. You will also get code that is better at communicating its intent when all the functions have a clear definition of the preconditions and postconditions.
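In F#, the assert function maps to System.Diagnostics.Debug.Assert, which makes this difference concrete: the checks only fire in builds where the DEBUG symbol is defined. A minimal illustration:

let div x y =
    // evaluated only when compiled with DEBUG defined;
    // in a release build, the check compiles away
    assert (y <> 0)
    x / y

// debug build: a failed assertion is reported when y = 0
// release build: div 1 0 throws System.DivideByZeroException instead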

Designing code to be written

Testing your code is also an exercise in making it modular, enabling it to be called from outside its original context. In doing so, you force the application to maintain an API in order for you to properly test it. It should be seen as a strength of the methodology that it makes the code more concise and easier to read. It also enforces good patterns, such as the single responsibility principle and dependency injection.

There is a reason for the test-driven development mantra: red, green, refactor. The refactor part is essential to creating a successful test suite and application. You use a test to drive the design of your code, making it testable by construction:

open System.Net
open System.Text.RegularExpressions

let rec crawl result url =
    // is duplicate if url exists in result
    let isDuplicate = result |> List.exists ((=) url)
    if isDuplicate then
        result
    else
        // create uri
        let uri = new System.Uri(url)
        // create web client
        let client = new WebClient()
        // download html
        let html = client.DownloadString(uri)
        // get all URLs
        let expression = new Regex(@"href=""(.*?)""")
        let captures =
            expression.Matches(html)
            |> Seq.cast<Match>
            |> Seq.map (fun m -> m.Groups.[1].Value)
            |> Seq.toList
        // join result with crawling all captured urls, excluding the current one
        List.collect (fun c -> crawl (result @ (captures |> List.filter ((<>) c))) c) captures

This program will get the contents of a URL, find all the links on the page, and crawl those links in order to find more URLs. This will happen until there are no more URLs to visit.

The code is hard to test because it does many things. If we extract functions, the code will be easier to test, have higher cohesion, and also be better in terms of the single responsibility principle.

The following code is an example of extracted functions:

// item exists in list -> true
let isDuplicate result url = List.exists ((=) url) result

// return html for url
let getHtml url = (new WebClient()).DownloadString(new System.Uri(url))

// extract a-tag hrefs from html
let getUrls html =
    Regex.Matches(html, @"href=""(.*?)""")
    |> Seq.cast<Match>
    |> Seq.map (fun m -> m.Groups.[1].Value)
    |> Seq.toList

// return list except item
let except item list = List.filter ((<>) item) list

// merge crawl of urls with result
let merge crawl result urls =
    List.collect (fun url -> crawl (result @ (urls |> except url)) url) urls

// crawl url unless we already crawled it
let rec crawl result url =
    if isDuplicate result url then
        result
    else
        (getHtml url) |> getUrls |> merge crawl result

The functionality is the same, but the code is much easier to test. Each individual part of the solution is now open for testing without causing side effects to the other parts.
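To see the gain, here is a minimal sketch of a unit test against one of the extracted functions, in the same NUnit/FsUnit style as the earlier examples:

[<Test>]
let ``should extract hrefs from html`` () =
    let html = """<a href="http://mikaellundin.name/">link</a>"""
    getUrls html |> should equal ["http://mikaellundin.name/"]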

Writing tests for regression

When developers try to convince managers that testing is necessary to their project, regression is often the card that is drawn. They claim that the tests can be run on a build server to make sure functionality is continuously verified. While this is true, it is more of a side effect, unless the tests are written for this specific purpose.

A good regression test states something about the functionality that is always true. It should depend not so much on the implementation as on the specification of the functionality.

I was once working with a client on a system that was fairly complex. It was an order process that was divided into several steps to minimize the complexity for the end user, but still, there were hundreds of business rules implemented in the backend.

I was working alone one evening in the office when a business analyst came rushing in, claiming that I needed to take the website down. After some querying, he told me that the discount logic for students was wrong.

With the business analyst standing over my shoulder, I went into my test suite and found the following regression test:

PriceCalc_Should_Discount_Students_With_Child_Under_16_yo()

The test turned green as I ran it, and I asked him for his test data. It turned out he was using his own personal information as test data, and he had a daughter who had recently turned 16.

One peculiar observation about bugs is that they have a tendency to come back unless carefully watched. This is why it's always best to write a regression test on finding a bug, to make sure it doesn't reappear. Personally, I write these tests to verify the claim of the bug and then use the test to tell me when the bug is fixed.
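As a sketch, a regression test for the bug claim above might look like the following, where PriceCalc.studentDiscount and its signature are hypothetical, for illustration only:

[<Test>]
let ``should not discount student whose child has turned 16`` () =
    // regression: the discount must no longer apply once the
    // child's 16th birthday has passed (childAge is in whole years)
    let childAge = 16
    PriceCalc.studentDiscount childAge |> should equal 0.0m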

Executable specifications

Written tests describe, from an outside perspective, how the system behaves. This is a very powerful concept that, if enriched, will lead to tests as specifications. Looking at the tests will tell you how the system works.

Having a large test suite could easily replace the requirements and specifications of the system, as the test suite verifies the stated functionality every time tests are run. The documented specifications and requirements become outdated after the first change.

I was once consulting for a client that was going to sell gym memberships online. The implementation itself was not that hard: gather customer information and store it in a Customer Relationship Management (CRM) system. The credit card payment was hosted by an external payment provider and integrated with some basic HTTP redirects.

However, the client was insistent on having very complex price logic to a degree where it was impossible for one person to understand why a membership had been assigned a target price.

In order to implement this, I started from the requirements and wrote them all as tests. I felt confident that my test suite covered the whole problem and would implement the system in such a way that my test would turn green.

Turning over the solution to the client for User Acceptance Testing (UAT), I got back 10 scenarios where the client claimed the membership had the wrong price.

Still confident in my method of implementation, I chose to implement all failing scenarios as tests. It proved that the code was not wrong, but the logic was so complex that the client couldn't verify it in UAT.

After some iteration with this, the client finally gave up acceptance testing and had us release it to production. As a consultant, I should have advised my client to simplify their price logic.

What tests as specifications try to achieve is executable specifications, written in a natural language, that can verify the following:

· The code is implemented as specified

· The code keeps fulfilling the specification (regression)

When specifications are written in a natural language, programmers and business analysts can share a common workspace for how the system is supposed to work.

The following example shows the specifications written in a natural language:

Feature: Authentication

Scenario: Entering correct login information makes user authenticated
    Given a fresh browser session at http://mikaellundin.name/login
    When entering 'mikaellundin' as username
    And entering 'hello fsharp' as password
    Then browser should redirect to http://mikaellundin.name/profile

The specification is written in a Domain Specific Language (DSL) called Gherkin. Each line has code connected to it that will execute when the specification itself is executed to verify that the requirement is fulfilled.
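As a sketch of what that connected code can look like in F#, here are step bindings in the style of TickSpec, an F# BDD library. The binding style and the browser helper are assumptions of this example, not verbatim library usage:

open TickSpec

// hypothetical browser automation helper, for illustration only
let browser = Browser.start ()

[<Given>]
let ``a fresh browser session at (.*)`` (url : string) =
    browser.Navigate url

[<When>]
let ``entering '(.*)' as username`` (username : string) =
    browser.TypeInto("#username", username)

Each group captured from the step text, such as (.*), is passed to the binding as an argument.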

Summary

Let's say you're at the airport by the self-service check-in trying to print your boarding card. The machine does not accept your booking number at first, but after a few retries, you're able to check in. After confirming, the machine hangs before printing your boarding card and you're not sure whether you've checked in or not. You move on to the next machine to try again.

In our society today, we put so much of our faith in machines. They handle everything for us, from flying airplanes to shopping online and paying the bills. It is when it doesn't work that we stop in our tracks and reflect on the fact that while the machine might be perfect, the programmer is not.

The reason behind testing is to create stability, predictability, and quality in our software. Writing tests reduces the number of bugs produced and the number of bugs found by our testers.

We write tests to make software cheaper. We do this because bugs are expensive. We do this because change is expensive. And we do this because we would rather go slowly and methodically in the right direction, than very fast down the wrong lane.

In this chapter, we touched upon what test automation is and why it's necessary. The next chapter will look at functional programming and how it makes testing a breeze.