Test Smells - Testing with F# (2015)

Chapter 9. Test Smells

A code smell is a symptom in the code that possibly indicates a deeper problem. It is the place where we look and think that there must be a better way of doing this. Tests smell too, but they tend to smell when they are run. There are different kinds of test smells, and I will examine each of them by asking the following questions:

· What is it?

· How did it come to be?

· What to do about it?

After reading this chapter, you will be able to identify test smells at an early stage and fight them in order to keep a good and healthy test suite.

Tests that break upon refactoring

One very common test smell when dealing with unit tests is that your tests break when you refactor, even though the functionality stays the same. These kinds of tests are called brittle tests.

There are two kinds of breaks:

· Your test suite doesn't compile after refactoring

· Your test suite still compiles, but no longer turns green

Tests stop compiling if you change the external API the test relies on. If that change is part of your refactoring, then the compilation error is okay and expected. If you're only changing the internal implementation of your functionality, then the test should not be affected and should still compile.

A test that fails to compile after refactoring could be a sign of testing at too granular a level and of testing implementation details instead of the function as a whole.

If your test suite still compiles after refactoring but your tests fail, you have the same root problem: you're testing at too granular an abstraction level. This is common when using mocks that know too much about how parts of the system interact with one another. When that implementation detail changes, the test breaks, even though the functionality stays the same.

This is also why many unit testing advocates preach that we should stay away from mocks, as mocks have us testing things that are too close to implementation details.

This kind of problem usually comes from writing the tests last, after the implementation, as it is then much easier to write tests that cover the code that was written than tests that cover the feature that was developed.

As a developer, you can use several tools in order to avoid this test smell:

· Do TDD and write tests first

· Write your tests as user stories

· Don't use mocking frameworks

Lastly, don't use coverage tools that make you stare at a coverage percentage instead of at the testing problem in front of you.
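To make the user story advice concrete, here is a minimal sketch of what a behavior-level test can look like in F#, assuming NUnit and FsUnit are referenced; the Cart module and its functions are made up for the example. The test only exercises the public API in domain terms, so renaming or restructuring the internals would not break it:

module CartTests

open NUnit.Framework
open FsUnit

// Hypothetical system under test: a tiny shopping cart
type Item = { Name : string; Price : decimal }

module Cart =
    let empty : Item list = []
    let add item cart = item :: cart
    let total (cart : Item list) = cart |> List.sumBy (fun item -> item.Price)

[<Test>]
let ``customer adding two items is charged the sum of their prices`` () =
    // The test reads like a user story and only touches the public API,
    // so restructuring the internals of Cart will not break it
    let cart =
        Cart.empty
        |> Cart.add { Name = "book"; Price = 30M }
        |> Cart.add { Name = "pen"; Price = 5M }
    Cart.total cart |> should equal 35M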

Tests that break occasionally

One of the most common test smells is tests that break from time to time. Sometimes you have a test that breaks once and then never again. Other tests will always fail the first time and then succeed when they're rerun. It also happens that you write a test that works perfectly now but will fail sometime in the future.

When it comes to unit testing, this kind of test smell is quite rare. Since your unit tests should operate on memory only, there are few things that can make a test fail occasionally. One of them is threading: a race condition will make a test break whenever the instructions don't happen to execute in the expected order, causing the test to fail even though it is a unit test.

It is much more common for integration tests to have occasional hiccups. There could be a dip in the network during a database transaction, or the hard drive that you're writing logs to could be full. These are things we have a hard time protecting ourselves from, as it is not the responsibility of the test to check for them.

Tests that always fail the first time and succeed the second are caused by bad timing. One common example is tests timing out because the application pool is still restarting after a successful deploy.

Another mistake is to disregard caching, where a test fails on the first try but succeeds on the second because by then the cache has been filled with everything the test needs.

In the same manner, a large test suite might create state that helps other tests pass or makes them fail. Running a test individually is not the same as running it as part of a test suite, as test runs very seldom clear all state and start from a clean slate. There will always be some state that can make your tests pass or fail.

The danger of this test smell is that we stop trusting the test suite when it fails for no apparent reason. If the test suite is always red, we will not go to it to check whether the system is healthy or not, and the test suite stops delivering its promised value.

The following actions can be taken to avoid this test smell:

· Avoid testing threaded code. Test the individual parts in a single-threaded context.

· Try to isolate each test as best you can by clearing all state left behind by previous tests.

· Make sure that the system under test (SUT) is prepared when you run your test suite. If you have just deployed code to an Internet Information Services (IIS) website, make sure you warm it up.

If a test often fails because of the circumstances around it, then it is better to remove that test than to let it devalue your whole test suite.

Your test suite should always be green, with only passing tests. If there is one test that is misbehaving and occasionally turning red, it is better to delete that test. A test suite must be trusted by the developers, and it can't be trusted if it often turns red.
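Before deleting a flaky test, it is worth checking whether leaked state is the culprit. The following is a minimal sketch of the isolation point from the list above, assuming NUnit and FsUnit are referenced; the mutable Registry module is a made-up stand-in for any shared state that leaks between tests. Resetting it in a setup method gives every test a clean slate instead of whatever the previous test left behind:

module IsolationExample

open NUnit.Framework
open FsUnit

// Hypothetical shared state that would leak between tests if never reset
module Registry =
    let mutable entries : string list = []
    let register name = entries <- name :: entries
    let count () = List.length entries
    let reset () = entries <- []

[<TestFixture>]
type RegistryTests () =
    [<SetUp>]
    member this.SetUp () =
        // Clear any state left behind by earlier tests so each test starts clean
        Registry.reset ()

    [<Test>]
    member this.``registering one entry yields a count of one`` () =
        Registry.register "alpha"
        Registry.count () |> should equal 1

    [<Test>]
    member this.``an empty registry has a count of zero`` () =
        // Passes regardless of test ordering thanks to the reset in SetUp
        Registry.count () |> should equal 0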

Tests that never break

Another common smell is tests that never break, no matter how much you break the feature. This is quite common in unit tests written after the system under test. The test is there to cover some part of the code and bring the coverage up, but it is disconnected from the feature it is actually supposed to cover.

What these tests have in common is a bug in the test itself that makes it pass even when the result is wrong. Writing bugs in tests is quite common and should never be underestimated. Another common cause is that the test is badly named and does not actually assert the thing it is named after. The result is green, passing tests while the thing the test was supposed to verify is actually failing.

This is the test you will find when hunting for a bug. The test will claim to cover the code where the bug lives, yet it passes even though the problem is obviously there. In the end, you will find that the test itself has a bug, and in order to expose the problem in the code, you must first fix the bug in the test.

The problem with these tests is that they bring no value, as they do not verify that the feature is working. On the contrary, such a test is dangerous, as it provides a false sense of security: the illusion that 100 percent coverage means the features are 100 percent bug-free.

This is how you can avoid this smell:

· Make sure your test is always red before it's green

· Don't write tests in order to bring up code coverage

· Ensure that you're asserting exactly what you've named the test after

If you find a test that will not fail, you should fix it so that it starts providing value, or remove it from the test suite. Otherwise, it is only there for bragging about the number of tests or the amount of code coverage you have, which is waste that brings no value.
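The following is a minimal sketch of how such a bug typically looks, assuming NUnit and FsUnit are referenced and using a made-up applyDiscount function. The first test asserts on its own input rather than on the result, so it can never go red no matter how broken the discount logic is; the second asserts what the test name actually promises:

module DiscountTests

open NUnit.Framework
open FsUnit

// Hypothetical system under test
let applyDiscount (percent : decimal) (price : decimal) =
    price - price * percent / 100M

[<Test>]
let ``a 10 percent discount reduces the price -- buggy version that can never fail`` () =
    let price = 100M
    let discounted = applyDiscount 10M price
    // Bug: the assertion checks the input, not the result, so this test
    // stays green no matter what applyDiscount returns
    price |> should equal 100M

[<Test>]
let ``a 10 percent discount reduces the price -- asserting what the name promises`` () =
    // This version was red before applyDiscount was implemented,
    // and it verifies exactly what the test name claims
    applyDiscount 10M 100M |> should equal 90M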

Tests that are too complex

If you cannot see what the test is about at a glance, then it is too complex. The first thing you look at is the name of the test, which should tell you what the test is asserting, and the second is the implementation of the test, which should be a few lines of straightforward code.

The signs of a test that is too complex are as follows:

· A large amount of setup code is needed

· Conditional logic, such as if, switch, or null-coalescing operators, is used

· Looping constructs, such as while, for, or foreach, are used

· The test needs helper functions or types to operate

· It has more than one mock or stub

· It requires mocking and stubbing more than one method or property

· The test doesn't fit the screen without scrolling

When the test fails, the developer looking at it might not be the one who wrote it in the first place. This makes it imperative that the test itself is as straightforward as possible. Quite plausibly, the developer looking at the test is not even very familiar with the system; reading the tests should help that developer understand the system under test, not make it seem more complicated.
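For contrast, here is a minimal sketch of a test at a comfortable level of simplicity, assuming NUnit and FsUnit are referenced and using a made-up Order.totalWithVat function: no conditionals, no loops, no helpers, and it fits on the screen with room to spare:

module SimpleTestExample

open NUnit.Framework
open FsUnit

// Hypothetical system under test
module Order =
    let totalWithVat (vatRate : decimal) (amounts : decimal list) =
        let net = List.sum amounts
        net + net * vatRate

[<Test>]
let ``order total includes 25 percent VAT`` () =
    // One line of setup, one call, one assertion -- readable at a glance
    let amounts = [100M; 60M]
    Order.totalWithVat 0.25M amounts |> should equal 200M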

The most common reason for ending up with complex tests is that you have a system that is hard to test. If you have sole responsibility for such a system, then you can only blame yourself for not designing the code for testability.

Sometimes this is hard to control, such as when you depend on a framework that is itself difficult to test. The most common example is ASP.NET, which has become better with the latest versions but still poses a challenge when dealing with state such as cookies, sessions, and redirects.

There are things we can do in order to produce simpler tests, as follows:

· Do TDD and write the test first. This will let you design for testability.

· Stay away from frameworks that make testing harder.

· SOLID principles will bring you far, but composition over inheritance is crucial to reduce coupling.

· Stop writing the test when it starts to smell and start refactoring the SUT instead.

A test that breaks and is too complex for the developer to fix should be deleted. In other words, complex tests devalue your test suite and should be avoided. Once we get some practice in testing, writing simple tests comes more naturally, and it becomes easier to produce good, healthy test suites.

Tests that require excessive setup

One indication that your system under test is too complex is that the test setup becomes excessive. The easiest way to identify this is when you need 20 lines of setup code in order to run one line of test code. Some of that setup might also need to be torn down after the test is complete.

The test itself doesn't have to be very complex, but the complexity of the system under test forces the excessive setup onto the test. What often happens is that this setup is duplicated from test to test, causing a lot of duplicated code. The most commonly seen remedy is to extract the setup into a shared method, but that only hides the problem. Instead, the problem should be addressed in the system under test.

Excessive setup code usually consists of the following:

· Setup of dependencies necessary to execute the test

· Setup of states necessary to execute the test

In most modern systems, there is a dependency injection framework in the middle that handles dependencies throughout the system. This is because object orientation is flawed in a way that requires the developer to create large object graphs in order to perform a simple task. These object graphs are created and managed by the dependency injection framework, so the developer can request an instance of a specific type and let the framework worry about the dependencies.

When writing unit tests, you often want to exchange the dependencies for your own stubs. Commonly, you want to exchange dependencies that are indirect, a few levels down the object graph, and not only the direct dependencies you would pass in when creating the class manually.

This leads us to the major parts of the test setup: configuring the dependency injection (DI) framework to exchange the dependencies you need, and then resetting it in the teardown, because the DI container is often a singleton in the system and is shared throughout the test suite's execution.

So, what can we do about excessive test setup? We can do the following:

· Reduce the dependency graph in the system under test

· Avoid relying on global states such as singletons or thread states such as session data

· Design your APIs to be easy to set up, and use conventions so that not all options are mandatory

· When writing a test that requires excessive setup, stop and refactor the SUT instead

The danger with tests that have lots of setup code is that they become hard to read, and that the setup gets extracted into helper methods instead of the problem being dealt with in the system under test. There is also potential waste: a change to the dependencies of the SUT will cause a lot of maintenance in the test suite, as all the test setups then need to be revised before the suite compiles again. This becomes a large time sink in the project and a major waste.
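One way to shrink both the dependency graph and the setup, in the spirit of the advice above, is to take dependencies as plain functions instead of resolving them through a container. The sketch below assumes NUnit and FsUnit are referenced and uses a made-up placeOrder function; in the test, each stub is a one-line function, so there is no DI framework to configure and nothing to reset afterwards:

module FunctionInjectionExample

open NUnit.Framework
open FsUnit

// Hypothetical system under test: dependencies are plain functions passed in,
// rather than interfaces resolved from a DI container
let placeOrder (saveOrder : string -> unit) (notify : string -> unit) (customer : string) =
    saveOrder customer
    notify customer
    sprintf "Order placed for %s" customer

[<Test>]
let ``placing an order notifies the customer`` () =
    // Stubs are one-line functions: no container to configure, nothing to reset
    let notified = ref None
    let saveOrder _ = ()                  // persistence is irrelevant in this test
    let notify customer = notified := Some customer

    placeOrder saveOrder notify "Alice" |> ignore

    !notified |> should equal (Some "Alice")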

Developers not writing tests

The largest problem your test suite will have is developers who won't write tests. If there is a test suite and it is run upon committing new code, then the tests will run when these developers commit code to the repository. However, the new code will not be covered by tests, and it is uncertain what will happen if the commit breaks the existing tests. Will the developer fix those tests or leave the test suite red?

Not writing tests starts to devalue the whole test suite as coverage goes down. If some developers stop updating the test suite, others will follow, and soon you will have a test suite without any active development. The team then misses out on the full benefits, such as being able to refactor safely or feel confident about new features.

I have experienced development teams disabling a whole test suite because a test was failing. They claimed it was because they needed a green build in order to deploy to production. The reason they were talking to me was that the deployment had critical errors in it that they didn't know how to fix. My first suggestion to them was to re-enable the test suite and make the tests pass.

The question it boils down to is: why aren't these developers writing tests? I did some asking around and distilled the answers down to these:

· Testing is a waste

· Management won't let us

· We don't know where to start

· Our code is too hard to test

· It's a thing about culture

All of these issues come from developers who haven't practiced testing in their day-to-day work. They may have learned about it and know that it is a better way of doing software development, but they lack the extra motivation to start writing tests.

Testing is a waste

There is a certain type of programmer that is reluctant to change, and every proposal of a different way of working will meet fierce resistance. The uninformed developer will claim that all time not spent on writing features is waste. The informed developer knows that spending time on tests means spending somewhat less time on building features up front, but that we gain from it in the long run by delivering higher-quality code that is less buggy.

The only way of getting around this mentality is to pair up and show how easy it is to start writing tests. Once you have some green tests to feel good about, the resistance will eventually die down.

Management won't let us

Project managers have one foot in the client's camp and the other in development. They need to engage the clients and make promises that they know are not realistic and will be hard to keep. As project managers are often placed above the team in the hierarchy, the manager will force the team to deliver on those promises. This is one of the most common reasons that managers don't let a team engage in quality measures: they are always chasing the next promised deadline.

It's like building a house of cards that will eventually fall apart, and this is why we prefer working in agile, cross-functional teams.

If you can't fire your management for not letting you do your job, then it is time to look for a new job. There are enough good jobs around that it's a shame to stay at a bad one.

We don't know where to start

I've done a lot of tutoring on test-driven development, and even coming straight out of a two-hour seminar on TDD, developers will open up a code editor and just stare blankly at the screen for minutes because they can't come up with a good name for their first test. This happens to everyone and is part of the learning curve of becoming a testing developer.

A good solution I've found is to pair up with developers who are new to testing and help them out with some backseat driving. Just helping them reason about what makes a good test name will take them a good way down the path to test automation.

Our code is hard to test

Most systems aren't that resistant to test automation. In almost all situations, you will be able to isolate the parts of the system that are hard to test and focus testing on the core functionality, where the impact is most valuable. If you are new to testing and insecure about it, then denial is a natural reaction.

The easiest way to get around this is to have the junior testing developer pair up with a senior and sort out the problems by testing a particular system. Most of the time, there will be an easy way to get around the most common obstacles.

It's a thing about culture

What you need is a testing culture, where it is natural for all the developers in the team to engage in testing activities. With a strong testing culture, there will never be any doubt about what to test.

You can build a testing culture by convincing your team that everything should be tested. Build up the confidence in the team around testing, and then it will start to spread around the company. Other teams will envy your successes and start adopting your methods and processes, and soon you will grow your very own testing culture.

Summary

We have been looking at different test smells that indicate that there is something wrong with your test suite, with your test process, or with the competence or motivation of your testing developers.

The important thing when doing test automation is to never forget what you're doing and why you're doing it. While writing test number 2,501 in the suite, it can be difficult to remember that the test helps you design your system. It provides regression protection so you avoid fixing problems that you've already fixed. Lastly, it helps you build quality into your application, which reduces the total number of bugs to fix in the first place.

In the next chapter, I will close this book with some thoughts on dos and don'ts when it comes to test automation. This will help you succeed in scaling large test suites.