The Controversy of Test Automation - Testing with F# (2015)

Chapter 7. The Controversy of Test Automation

Testing is hard. It is hard to find the right balance in testing to know what to test and what to skip. It is also hard to write high-quality tests that bring more value than the effort it took to produce them.

However, the value of a good test suite is tremendous and shouldn't be neglected out of fear of learning or challenging what is difficult. This chapter will focus on those difficulties and teach you how to think about testing and quality measures in general.

In this chapter, we will cover the following topics:

· Bugs or defects

· The cost of quality

· The false security of code coverage

· Test-driven development

· Testing or fact-checking

After this chapter, you will feel confident about applying test automation to your project. It will help you decide not only when to write a test and when to refrain, but also how to test. This chapter is a fast track through the collected experience of test automation and quality in software development.

Bugs or defects

Software testers have a glossary of terms for labeling the results of their testing: error, fault, bug, and failure. These terms do not apply cleanly to test automation, and since fault and bug are used interchangeably, I would consider the list antiquated.

As programmers, we see the code from an inside-out perspective and do not think of familiarizing ourselves with software testers' terms. This is also why we have one common name for a fault, and that is bug. The problem with this term appears when clients find a bug and you as a developer insist it's the intended behavior.

It is a common misconception that code can only be right or wrong. In reality, however, there are many gray zones of software failure that can cause real trouble.

The following image is a map around quality in a software project:

Bugs or defects

A software failure can come from missing or vague requirements. It can also come from weak specifications that don't make clear what to expect of the features developed. It is common that the expectation of the product owner differs from that of the team actually developing the feature. Then there is the development itself, where developers sometimes make human errors, for example, typos.

The quality of a product is a contract between the product owner and the development team, and what holds it all together is process and communication.

Bugs

Let's say the tester on your team comes to you and says, "I expected to get an updated total price when I put the products in the shopping cart." If this report comes as a surprise because you thought you'd written it that way, then it's a bug.

Bugs can be described as a scenario in which the code was intended to do X, but instead does Y. This means that the developer has made a mistake and this has resulted in a bug. Both the tester and developer agree that there is a fault in the software.

The following are examples of bugs in a web application:

· Clicking on the Submit button should submit the form, but it doesn't seem to do anything

· No confirmation e-mail is sent out at the end of the order process

· After increasing the quantity of line items in the shopping cart, the quantity number resets to the previous number after the page is reloaded

The strangest observation I've made about bugs is their ability to reappear. A bug is fixed and the solution is deployed to production. The client confirms that the bug is fixed and the issue is closed in the bug-tracking software. However, after the next deployment, the bug is back and no one can explain how it happened.

How to avoid them

No programmer is perfect, and we often fail to recognize bugs in our own software. Instead, we produce a fairly constant number of errors for every thousand lines of code we write. There are several reasons for this:

· The programmer is human and can make errors

· The feature is rushed because of the project plan

· The work environment contains distractions

· The requirements or specification is misunderstood

· The developer is not well-versed in the technologies used

I have yet to meet a programmer that doesn't produce bugs.

A while back, I was working with a client who had very little IT project experience. The project was running late because of a high number of bugs that were found during the user acceptance test. The client came up with the following suggestion:

Why not do it right the first time?

The project manager guffawed at the client's ignorance and went on a rampage about how impossible it was to work with such inexperienced product owners. My answer to the product owner was as follows:

The client is right. We should do it right the first time, but that would require a higher initial investment in quality.

The project manager walked away.

The writing of this book is a similar scenario. I have a tool that checks spelling and grammar so I can correct them before sending the text to my editor, who will then proofread and give suggestions on how to change the content in order to better meet the requirements of my readers.

All code should be considered a draft until it has been through the quality assurance process the team has previously decided upon.

The following quality measures can be used to validate code:

· Statically typed language: prevents lexical and semantic errors by compiling the code

· Static code analysis: prevents logical errors by parsing the code against a set of rules

· Test automation: prevents logical errors by writing consuming code that verifies the system

· Code review: prevents logical errors by having another developer review the code

· System testing: verifies that the product fulfills the requirements

There will always be errors in the code we write. As software developers, we have to accept that we're the source of errors in our software and we need to do what we can to mitigate this.

The only way to keep bugs from coming back is to write a test that verifies the bug is fixed and make sure that test runs for every subsequent commit. Then you won't have to deal with the embarrassment of explaining to the client why a bug that was fixed has reappeared and needs to be fixed again.
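As a minimal sketch of such a regression test in F# with NUnit and FsUnit, using a hypothetical shopping cart total (the function and its behavior are invented for illustration, not code from this book):

```fsharp
open NUnit.Framework
open FsUnit

// Hypothetical system under test: each cart line is a price and a quantity
let totalPrice (items: (decimal * int) list) =
    items |> List.sumBy (fun (price, quantity) -> price * decimal quantity)

// Regression test written when the bug was reported; because it runs on
// every subsequent commit, the bug cannot silently reappear after a deployment
[<Test>]
let ``total price should be updated when products are put in the cart`` () =
    totalPrice [ 10.0m, 2; 5.0m, 1 ] |> should equal 25.0m
```

The test name repeats the tester's wording of the bug report, so the commit history documents exactly which fault the test pins down.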

Defects

Imagine a feature has been released for a user acceptance test. It has been implemented according to the specification and fulfills the requirements. Still, the client comes back saying, "This is not as I expected." This means the feature has a defect.

A defect can be described as the state in which the developer intended the code to do X and the code does X, but the client was expecting Y.

The client will call it a bug, but the developer will claim that the code does as intended. The project manager will then come in and call it a change in order to charge the client for it.

The following are examples of defects in a web application:

· The client expected to be redirected back to the start page after a successful shopping cart checkout

· The client expected they wouldn't be able to put more products into the shopping cart than there are in the inventory

· The client expected that clicking the logo of the web page would take them back to the start page

The problem is in the way that features are communicated. It's caused by a misalignment between the expectations of the client and the implementation of the developer. This might not be considered a code problem, but it is a huge problem for the software, as it represents the first line of changes that will make the code more complex.

How to avoid them

The only way to avoid defects is to ensure the team and client work in close collaboration, both in planning and on a day-to-day basis. The client should be included in daily standups and should approve the feature specifications so there are no surprises coming out of development.

The following points explain how to avoid defects:

· Create detailed specifications

· Have the product owner and tester review the specifications

· Ensure the client and team are working in close collaboration to align expectations

Fewer defects will lead to fewer changes, which will lead to higher code quality. Higher quality will lead to fewer bugs. Good communication creates an upward spiral of quality.

The difference between bugs and defects

In order to avoid bugs, we need to improve the coding process, and we can do this with a variety of tools that will verify the programmer hasn't committed any faults.

In order to avoid defects, we need to improve the process and communication between the development team and product owner to ensure that they are aligned with regard to requirements and specifications.

The cost of quality

Bugs are the major unpredictable factor in software development projects; they are what make projects run late. Wouldn't it be great if we could write bug- and defect-free software? Imagine a project where no testers are needed because no bugs are created. Is this even possible? Is it something to strive for?

We are a young and immature industry. This is obvious from how many projects are running late, over budget, or simply canceled. We try to run our projects as if we're building bridges, and we like to compare our industry professionals to surgeons.

Our bridges, they fall, but we can't blame the materials or force majeure. We can only blame ourselves and our own ignorance, and vow that next time, we will focus more on quality and build a better bridge.

Quality index

The software development quality index is the number of bugs and defects produced for every thousand lines of code (KLOC).

The following image describes the software development quality index:

Quality index

The quality index should be as low as possible. It is a relative indicator of the quality of the solution, and it is best measured over time in order to make decisions regarding quality measures.

The following image shows how the quality index can be measured in a Scrum project:

Quality index

The quality index is directly proportional to the amount of code and bugs in that code. If you measure these things by each sprint, you can visualize the quality of the software in a simple line chart.
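The calculation is simple enough to sketch in a few lines of F#. All the sprint figures below are made up for illustration:

```fsharp
// Hypothetical sprint metrics: (sprint number, unresolved bugs, total lines of code)
let sprints = [ 1, 12, 8000; 2, 15, 12000; 3, 14, 18000 ]

// Quality index: unresolved bugs per thousand lines of code (KLOC)
let qualityIndex bugs loc = float bugs / (float loc / 1000.0)

// One data point per sprint, ready to plot as a line chart
let trend =
    sprints |> List.map (fun (sprint, bugs, loc) -> sprint, qualityIndex bugs loc)
// In this invented data the index falls from 1.5 to about 0.78,
// suggesting rising product quality (or weaker testing) over three sprints
```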

If the quality index is decreasing, it means there are fewer unresolved bugs for each KLOC. This could mean that the quality of the product is increasing, or it could mean that the quality of testing is decreasing. This is difficult to determine, but a decreasing quality index is preferred.

If the quality index is increasing, it means there are more unresolved bugs for each KLOC in the product. This could mean that the quality of the product is decreasing. It could also mean that the quality of testing is increasing.

The quality index is directly affected by the ability of the team to solve bugs. If the team doesn't have time to solve bugs during the sprint, the quality index will increase and the technical debt will build up.

The quality index is a black box measurement. It doesn't take code quality or practices into account. If there are no bugs in the software, it is considered high quality. If the team manages a product with bad code and practices but still keeps the number of bugs down, the quality index will be low and the product will be presumed to be of high quality.

The software craftsman

Software craftsmanship is a movement that compares developers with carpenters. You start out as an apprentice learning the craft, become intermediate after delivering some successful or failed projects, and eventually end up being a master.

The master software craftsman is always up to date with the latest technologies, pushing the boundaries of his or her team and the solutions produced. He or she maintains a weblog, attends conferences, and runs a few open source projects in the evenings.

These craftsmen have a disease. This disease makes them strive for quality because of their professional conduct. They don't want to be seen delivering software that is not in its prime condition, so they strive for quality for quality's sake.

Instead of seeing the value of the features, they focus on the value of the code. They spend hours setting up continuous integration, Test-driven Development (TDD), automation, and processes. These things do not deliver value to the client unless they help get features deployed to production. This is shown in the following image:

The software craftsman

Features start out as requirements, and these requirements are elaborated into specifications. The specifications turn into code, the code is verified by tests, the tested code is deployed to a pre-production environment, and finally it goes live as a feature. This is when users first reap the benefits of the feature, as it starts generating value. Every step before the user is a cost, and it should be considered waste if no one ends up using the feature.

Long before thinking of quality, one should consider the following:

· How much value does the feature bring?

· How many users will the feature have?

· How long do we expect the feature to exist?

Then, we can start discussing how large of a quality effort is needed to bring the most out of the feature.

Not all code is created equal

Not all developers want to create the perfect solution to the perfect problem. Most actually just want to do a good job, meet their deadlines, and go home in the evening to their families. The software craftsman finds it a waste of time to not spend the evening on research or open source projects. There is, however, something admirable and predictable about the 9–5 programmer.

The success of a software project is about creating predictability, and this is not only done by processes, but also by composing a team of predictable people. Not everyone in the team will be a rock star, but every team member should bring predictable results. Knowing your team and stocking it with the right people is the path to success.

All programmers have their own tools and skillsets, and they will attack a problem based on their own experiences. This makes the code we write very diverse. There is not one single solution to any given problem, nor one best solution; there are as many solutions as there are programmers. All this leads to some diversity in the quality produced by the team members.

Some developers write their tests first and others do this at the end. Many developers might not want to write tests at all. It is a challenge to find common ground and select the practices that may be done based on individual preferences and those that must be enforced.

Being a pragmatic programmer

A duct tape programmer is one who rolls up their sleeves and gets down to business, no matter how ugly the situation might be. This practice is the opposite of finding a generalized solution to the type of problem you want to solve. Instead, you look at what's asked for and use the bare minimum to make it work.

I was once paired with a C# developer, writing a self-help service for the client's website where they could log in, look at their issues, and track progress. He needed a way of parsing out e-mail addresses from an input string and wanted my help. I had already been doing this same thing for the backend service, so I simply told him to copy my code into his solution.

However, the developer said, "This is not Don't Repeat Yourself (DRY). Shouldn't we reuse what you did in the backend?" I said, "Yes, we're doing that by copying my code." It meant we had to maintain the code in two places, but that could be done at a lower cost than implementing a SOAP web service just to expose the e-mail regular expression.

We ended up copying the code, violating DRY for the greater good.

The value of the code is in the feature being used. This value decreases when the code requires an extensive amount of maintenance. This is the risk of duct tape solutions: if it breaks, you'll spend an unreasonable amount of time fixing it. Still, the generalized solution takes time to create, and it doesn't promise to save you time on maintenance either.

Good enough

Bugs cost money. They also cost time that could be better spent working on a new functionality. They are responsible for pushing deadlines, and they cause the downtime of your site. A bug is associated with not just annoyance, but cost.

If you can put a price on a bug, you can also decide how much money you're willing to spend in order to avoid one.

Before starting a software development project, I usually ask the client what they wish to prioritize out of the following attributes:

· Cost efficiency

· High quality

· Many features

It gives the team a good direction when planning and specifying features. If the client answers that they want all three in a healthy mix, you know you're in trouble.

I had this rare client that put quality above all. Nothing else mattered, and we could spend as many hours as we liked as long as it produced a high-quality solution.

I didn't want to challenge the client on this, but I still asked why this was the case, and the client made this argument: We have 5,000 orders from the website on any given Monday. Each order is worth 50€. This is an income of 250,000€ every day; divided across the day, it becomes roughly 10,500€ an hour. We hire you for 100€ an hour, which means you could spend 105 hours on quality if it were to give the website one more hour of uptime.

Needless to say, the requirement was 99.995 percent uptime, which meant they allowed about 25 minutes of planned maintenance every year.

There are many different quality improvements that can be applied to a code base or development process. Each will cost a little and improve on the quality index. As a development team, it is important to pick the low-hanging fruit first, so to speak, in order to reap big benefits with little effort.

More quality improvements will eventually lead to fewer benefits, and there is a point at which a team should stop working on improving quality and focus on delivering the feature.

The following image shows how the value of quality measures decreases:

Good enough

At the dotted line, the amount of effort spent is equal to the value gained. Spending any more time in testing would be a waste.

An experienced developer will know where the spending of quality improvements matches the cost of the bugs and where the cost of quality improvement exceeds the value returned.

In a project I was running as a Scrum Master, we always took the opportunity at the start of the sprint demo to review how many bugs were created and solved during the sprint. The following graph is from this very project:

Good enough

With these metrics, you can easily determine how much time was spent fixing bugs and arrive at a number for what bugs cost the project each sprint. This gives you a good estimate of how much effort should be spent to avoid those bugs in the first place.
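A back-of-the-envelope version of that estimate can be written down directly. Every figure here is assumed for illustration; plug in your own sprint metrics:

```fsharp
// Assumed figures for one sprint: bugs fixed, average hours spent per bug,
// and the hourly rate billed to the project
let bugsFixed = 9
let hoursPerBug = 4.0
let hourlyRate = 100.0

// What this sprint's bugs cost the project; it is also an upper bound on
// what a preventive quality effort during the sprint would have been worth
let bugCost = float bugsFixed * hoursPerBug * hourlyRate
// 9 bugs * 4 h * 100 €/h = 3,600 €
```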

Technical debt

Writing code is never easy, and it doesn't get easier under external pressures, such as deadlines. The project manager will likely ask you to cut some corners in order to deliver the feature to the client by the deadline. This will make you skip some of the quality processes, such as tests and peer review, in order to deliver faster.

The result might be delivered faster, but with lower quality and some built-up technical debt.

The debt you owe to quality will cost the project interest for every feature added. This makes features more expensive and lets bugs appear more easily, and it will remain the case until the debt is repaid. Repaying the debt is done by refactoring the code and adding the quality measures that were skipped in order to deliver faster. Doing so is much more expensive after the feature has already been delivered, and this is the price that has to be paid for cutting corners.

Technical debt is not all bad. It enables you to make a choice: is it worth cutting corners on quality in order to meet a deadline? Sometimes the cost of fixing the code after the deadline is worth the debt. In that case, one should opt for the debt, but not forget to pay it off. This is where many software projects fail.

The false security of code coverage

When doing test automation, you can use a tool that tells you how large a part of your code is covered by tests. This may seem like a good idea at first glance, but it has some hidden dangers.

The following image shows the code coverage tool, NCover:

The false security of code coverage

The code coverage report will show you what code was traversed and how many times, but it's a false assumption that the report will tell you what was tested. How many times must a test pass over a line of code before we call it covered? Consider the following example:

open NUnit.Framework
open FsUnit

// System under test
let rec fibonacci = function
    | n when n < 2 -> 1
    | n -> fibonacci (n - 2) + fibonacci (n - 1)

let fibonacciSeq x = [0..(x - 1)] |> List.map fibonacci

[<Test>]
let returns_correct_result () =
    fibonacciSeq 5 |> should equal [1; 1; 2; 3; 5]

The test above is written only to traverse all the code in one go. It has no clear purpose, and it doesn't specify what is expected of the functionality. Compare it with the following tests, which do:

[<Test>]
let ``should expect 1 to be the first fibonacci number`` () =
    fibonacci 0 |> should equal 1

[<Test>]
let ``should expect 1 to be the second fibonacci number`` () =
    fibonacci 1 |> should equal 1

[<Test>]
let ``should expect 5 to be the fifth fibonacci number`` () =
    fibonacci 4 |> should equal 5

[<Test>]
let ``should expect 1, 1, 2, 3, 5 to be the five first fibonacci numbers`` () =
    fibonacciSeq 5 |> should equal [1; 1; 2; 3; 5]

A function will rarely be fully covered, because there are too many execution paths to follow each and every one of them. The test coverage report gives the developer a false sense of security, and it has an even worse effect in the hands of an ignorant project manager. Test coverage should be considered more harmful than useful.

The one and only benefit of test coverage reports is to analyze where testing has been left out and where to focus testing efforts. However, if such a report is needed, there should instead be a change in how software development is done, toward a more proactive take on quality assurance, instead of building technical debt and analyzing where the issues are in retrospect.

Measuring the delta of code coverage

The natural next step in code coverage is to create a report that will measure how coverage changes from one version of the code to another. There is software that can help you produce such reports in the build server, as shown in the following graph:

Measuring the delta of code coverage

The problem with such a report is that it will be used as a measurement of quality, which it is not. It might provide insights into rising technical debt, but without a trained eye and knowledge of the code that has been committed, the report might cause more harm than good.

There has been talk of development teams that have a requirement to never let the coverage percentage decrease. If they commit code that has a lower coverage percentage than the previous commit, the build fails until the new code has been covered by tests.

This is the sort of quality measure that is disruptive to the software development process and causes development to progress much more slowly. It will not cause quality to rise substantially, because programmers will write tests that cover as much code as possible just so the build will not fail.

Skip the code coverage tools and reports and put your trust in your methods and processes instead.

Test-driven development

In the beginning, there were these brilliant Java guys who wondered what would happen if you wrote code that executed some other piece of code in order to verify whether the original code worked. They created a framework called JUnit, designed to help others write code that tests other code. The idea became very popular and has since been forked into many different frameworks on different platforms.

The people behind this idea were Erich Gamma and Kent Beck, and being the thought leaders they were, they didn't stop at discovering test automation. Instead, they asked themselves, "What would happen if we write the test first and the system later?" This invention was labeled TDD and has since been the subject of countless discussions and controversy.

The workflow they invented is shown in the following image:

Test-driven development

The workflow illustrates writing the test first. If the test passes, the system under test already has the functionality implemented. If the test fails, the functionality needs to be implemented in the system until the test passes. Once all the tests pass, the developer either continues with the next test or is done implementing the feature.

The point of developing software test first is that you drive the design of the system from the tests, forcing the system to expose a public API that the tests can hook into for verification. You work in increments in which the system is gradually created unit by unit, implementing only enough to satisfy the test of the current unit and keeping the design simple and non-generalized.

The benefit of TDD is that the developer is forced to stop and think about what the code should do before starting to write it. Instead of a solution going from the mind directly to implementation, there is a middle step, the thought process we call design, which is formalized in a test before the actual feature is implemented. This leads to cleaner code, but also code that is better anchored in the requirements.

Red, green, refactor

It quickly became obvious with TDD that it was hard to maintain a larger perspective on the architecture when working outward through the system, unit by unit. Systems developed this way quickly became an entanglement of units, each very well designed on its own, but together forming a complex graph of dependencies. From these problems, the red, green, refactor model was born.

The following image shows the red, green, refactor model:

Red, green, refactor

First, the developer writes a test that will not pass and therefore will be red. Then, the system is implemented in the most trivial way so the test turns green. After the test turns green, the developer moves on to refactoring the code, making sure it fits the rest of the system and the rest of the system fits it. Once refactoring is complete, the developer is either done or continues by implementing the next test to complete the feature.

It is important that the test is red once written. If the test turns green immediately, it indicates that too much was implemented while making an earlier test green. The effect of tests turning green as soon as they are written is that you skip the refactoring step.

Refactoring is the missing piece of the TDD puzzle, and also the most important one. It is the art of taking a piece of code and improving its design without changing its function. By doing this, the developer will get around the side effects that were associated with TDD in the beginning and get large test-driven systems with good design.
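The cycle can be compressed into a small F# illustration. The discount rule here is invented for the example, not taken from any real system:

```fsharp
open NUnit.Framework
open FsUnit

// Red: the test below is written first and fails, because discountedTotal
// does not exist yet.
// Green: the most trivial implementation is to return the constant 180.0m.
// Refactor: the rule is then generalized, without changing the behavior
// the test observes:
let discountedTotal (total: decimal) =
    if total > 100.0m then total * 0.9m else total

[<Test>]
let ``should give a 10 percent discount on orders over 100`` () =
    discountedTotal 200.0m |> should equal 180.0m
```

Note that the refactored version still satisfies the original test; only the design changed, which is the whole point of the third step.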

The difficulty, and at the same time the benefit, of TDD is that the developer needs to know beforehand how the feature will be implemented. There is not much room for trial and error, which is otherwise a common practice among developers.

Aversions to test-driven development

A couple of years ago, TDD was heavily hyped and strongly evangelized by thought leaders in the software development industry. One of these individuals was Uncle Bob (Robert C. Martin), a strong spokesperson also well known in the .NET community, who claimed that 100 percent coverage was the only way to perform TDD.

We have since grown up as an industry and discovered that 100 percent coverage is a waste, as it encourages developers to write terrible tests just to reach the end goal.

The arguments that have sprung up against TDD in the last few years are as follows:

· Project managers mandate developers to perform 100 percent test coverage, which encourages them to write terrible tests just to meet the goal.

· Not all code provides much value in being tested, such as wrappers, facades, or build scripts.

· The test suite also needs development and maintenance. In order to bring more value than cost to a project, it needs to be managed.

· Some tests are better performed manually, when it's easy for a human to check whether the result is plausible, but difficult for a computer.

Your system bloats because of all the code needed to support your tests. Many test doubles and test interfaces are necessary in order to properly call the system under test. This code also needs maintenance and refactoring as the code base grows, and it risks becoming a framework in itself.

A good example of where TDD does not work is Cascading Style Sheets (CSS). Every rule in CSS is directly connected to a graphical representation on the screen, and there is no good way of verifying that the rendering is correct using test automation.

How do you test CSS first? The nature of working with CSS is that you write a rule, see what happens on the screen, and then adjust until the result matches the design.

It is not sensible to try writing tests first for this layer. The only way to get it right is to use a human to check whether the result looks accurate.

The test-driven development movement started out as a mind hack in order to write software in a different way. Many feel that fundamentalists have taken over with a louder and angrier rhetoric, making other developers feel bad for not testing or not writing tests first. It might have been necessary, however, to beat down the nonbelievers and declare them unprofessional to cause developers to start writing tests. Ultimately, it has divided the developer community into testers and non-testers.

Test first development

At the time when TDD first became all the rage and a source of great frustration and controversy, it became hard to label work that was being done as TDD without implying that tests were written before the code, as in the Beck/Gamma model.

However, most developers who were writing tests were not doing it in a test first manner, but called themselves TDD practitioners anyway. The only way to keep the concepts apart was to start calling them test first and test last, where the purists claimed that the only way of doing it right is to write the tests first, and that test last really has nothing to do with test-driven development.

Pragmatists will always claim that, in situations where tests bring more value to the development process than they cost, writing tests last is better than not writing tests at all.

Many developers have a hard time adjusting to the reversed thought process of testing first that requires you to visualize a fair amount inside your mind before you start writing the first test.

Tip

Don't feel bad for not writing tests first. The purpose is not the process, but the result.

Fixing bugs

The test first approach lends itself very well to fixing bugs in your code. You take advantage of tests being specifications and use them both to verify that the bug is there and that it is fixed. This also provides regression protection, making sure the bug will not reappear.

The following image shows the workflow for fixing bug tests first:

Fixing bugs

The first thing you need to do is add a test that fills the gap exposed by the bug. The test should turn red; otherwise, you have not managed to reproduce the bug. After this, you fix the bug, which causes the test to turn green and gives you instant verification that the bug is fixed.

Don't forget to refactor the code if needed.
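As a minimal sketch of this red-green workflow, assume a hypothetical bug report saying that discounts over 100 percent produce negative prices. The function, names, and numbers below are illustrative, not from any real code base:

```fsharp
open NUnit.Framework

// Hypothetical function under test. The fix clamps the discount at
// 100%, so the price can never drop below zero.
let applyDiscount (price : decimal) (percent : decimal) =
    let clamped = min percent 100m
    price - (price * clamped / 100m)

// This test is written first: it turns red against the buggy version
// (which lacked the clamping), and green once the fix above is in place.
[<Test>]
let ``Applying a discount over 100% should floor the price at zero`` () =
    Assert.AreEqual(0m, applyDiscount 50m 150m)
```

The same test then stays in the suite as a regression guard: if the clamping is ever removed, the suite turns red again.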

Using this workflow extensively for bugs, I've discovered a few things:

1. It is quite common for you to find out that the system actually works as expected, but that the bug report was wrong.

2. It is also common that you fix a bug, but the test doesn't turn green because you only thought you'd fixed it, when it's actually still there.

3. It is common that you fix the bug only to discover another test turns red. Your fix broke something else that now needs fixing.

4. It sometimes happens that you fix a bug and now another test has turned red, which contradicts the first test. The reported bug was a change of requirements in disguise.

Everything that gets unraveled by fixing bugs test first saves time for both me and the tester, because we iterate fewer times.

API design

Another situation where the test-first approach really excels is in Application Programming Interface (API) design. By specifying the API through the tests you write, you become the first consumer of that API. This will not only drive a well-designed API, but the tests will also serve as good examples of how the API can be used.

Here's an example of how tests can specify a search API:

open NUnit.Framework

[<Test>]
let ``Search should not accept empty search string`` () =
    Assert.Fail()

[<Test>]
let ``Search should return first page as default`` () =
    Assert.Fail()

[<Test>]
let ``Search should return 20 items per page as default`` () =
    Assert.Fail()

[<Test>]
let ``Search should return all items on 1 page when page size is zero`` () =
    Assert.Fail()
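As a sketch of how such a specification might then be driven to green, here is a hypothetical search function satisfying the first three tests. The record shape, parameter names, and defaults are assumptions for illustration, not part of an actual API:

```fsharp
// Hypothetical result type and search function shaped by the tests above.
type SearchResult = { Page : int; PageSize : int; Items : string list }

let search (query : string) (page : int option) (pageSize : int option) =
    if System.String.IsNullOrWhiteSpace query then
        invalidArg "query" "Search does not accept an empty search string"
    { Page = defaultArg page 1          // first page as default
      PageSize = defaultArg pageSize 20 // 20 items per page as default
      Items = [] }                      // actual lookup omitted in this sketch
```

The fourth test, returning all items on one page when the page size is zero, would then drive the next change, which is exactly the point: each red test pulls one small piece of design out of you.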

I was once working for a client for whom we had created a very nice type-ahead for addresses, namely all addresses in Sweden. The client, who had previously been very skeptical about our solution, was thrilled to see it in action and wanted it implemented on all their platforms. One of these was a WinForms application maintained by a competitor of ours. This is the classic problem where two competing vendors are forced to collaborate. If anything were to go wrong, there would be a lot of blaming, which would not help the client at all.

In order to keep concerns separated, I was assigned to implement the API that would serve the competitor with the search functionality featured in the type-ahead.

I started out by gathering the requirements, working out what tests I needed to confirm that the search API was working. I implemented each test one at a time until I had a completely green test suite. I shipped the API documentation together with the tests so our competitor could run the tests and see that it worked. In a situation where one party is very likely to blame the other for any miscommunication, this method actually worked wonders.

Test-driven development is not a matter of principle or doing things the right way. The only right way is to produce as much value as possible at the least possible cost. Sometimes, this means writing tests first and, at other times, writing them after the code. In rare cases, it means writing no tests at all. The important part is to recognize what brings value and implement it.

Testing or fact-checking

Upon asking James Bach, one of the most influential people in software testing, what it means to test, you will receive the answer that it means questioning the given product by operating and observing it. This enables you to make informed decisions about the product.

This definition of testing is quite far away from what we've been talking about so far, and this is because James Bach is talking about manual testing. The process of manual testing is a creative process in which you study and research, resulting in advice and reports to drive a decision. This doesn't sound very much like automation.

What we've been talking about so far is using tests to drive the design of our code and also as a specification for the product. This is a very proactive approach where we let testing drive the innovation of the product itself, in contrast to manual testing, which is a reaction to a product that has already been created.

The reasons for using automated tests are as follows:

· Test automation is performed at the same time as developing the code, whereas manual testing is done after the feature has been developed.

· These automated tests are written by a developer and not a tester, making it a development task and not a testing task.

· The automated tests will return a binary result determining if they pass or not each time code is committed. This result is different from the test report created by a tester.

· The automated tests are written to avoid creating bugs and producing a system with higher quality. The manual tests are performed to find problems with the product in order to iterate and improve it.

There have been discussions in the developer community about whether what we do can actually be called testing or whether it should be called something else.

A few years back, there was an initiative to rename automated tests as "examples", but it never really caught on. It is not a bad idea to see an automated test as an example of how the system works, but the notion doesn't fit into the red, green, refactor cycle.

Going back to James Bach, he prefers to call it software checking, as what we do only verifies facts that are already known about the product. Developers who are really into Behavior-driven Development (BDD) are keen on calling tests executable specifications, which is apt for the narrow class of tests that really are specifications.

Even if it doesn't feel quite right, "testing" is the term we're stuck with.

Replacing the tester with automation

It has been a common misunderstanding among the businesses for which I've been consulting that test automation will replace the need for a tester. They seem to think it's a good thing to automate testing so there is no need to hire a tester. From what we've discussed so far, this is of course nonsensical, as test automation and manual testing are very different, with completely different outcomes and purposes. Test automation will produce a product with fewer bugs, and manual testing will come up with improvements to the product so that it better solves the problem at hand.

There are situations in which manual testing can reap the benefits of test automation:

· Regression testing, making sure that everything that worked in the previous version is still working, is one area where automation excels. This is also an area where computers are particularly good and where the tester's time is better spent elsewhere.

There are excellent tools that let the tester automate the checking by recording the test and setting a condition on the result. This will reduce the time spent on regression testing and essentially let the tester spend more time focusing on exploratory testing.

· Humans are particularly bad at coming up with good test data. Computers are lousy at determining good test cases, but quite good at generating random test data. When you want to test something with a lot of diverse data, it is good to use a computer to generate that data in order to achieve unpredictability.
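As an illustrative sketch of the second point, here is one way to generate diverse, unpredictable string data using only System.Random; a property-based tool such as FsCheck does this far more systematically. All names below are assumptions:

```fsharp
// Minimal sketch: generate unpredictable strings of printable ASCII
// characters to feed into whatever is being tested.
let random = System.Random()

let randomString maxLength =
    let length = random.Next(1, maxLength + 1)   // 1..maxLength characters
    Array.init length (fun _ -> char (random.Next(32, 127)))
    |> System.String

// A batch of 100 diverse inputs for a data-driven test
let testData = List.init 100 (fun _ -> randomString 50)
```

Feeding a batch like this through a function under test tends to surface edge cases (whitespace, punctuation, odd lengths) that hand-picked examples miss.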

Seeing the many job openings for test automation engineers, I'm not quite sure what they do. They are not hired as developers, so they will not write tests as a part of their development process. They are not hired as testers, so their purpose is not to investigate and question the product. It will be interesting to see the progression of test automation in the future.

Summary

Test automation is still something that is new to the software industry. It is a game changer in terms of quality and making software projects predictable. It is a mind-bending practice that is hard for the uninitiated, but is a valuable tool once learned. It does not replace the tester in your team, and it cannot be used to measure developers' productivity.

In this chapter, we touched upon the most common discussions concerning test automation and sorted out what is good practice and what is not. We also touched upon the history of test-driven development, which has gone through all the phases, from discovery to fundamentalism, and has now arrived at some sort of pragmatism. The next chapter will put testing into the agile context and explore how it fits into the development process.