Testing with F# (2015)
Chapter 5. Integration Testing
If unit testing is a way of driving the design of your code, integration testing is purely focused on verifying that your code is working as expected. In this chapter, we will focus on how to use F# for integration testing, touching on the following subjects:
· Writing good integration tests
· Setting up and tearing down databases
· Speeding up integration testing
· Testing stored procedures
· Testing web services
After reading this chapter, you will know how to produce high-quality integration tests that will help you improve your system's stability and verify the contracts to external systems.
Good integration tests
An integration test is a black box test where we try to verify that the different parts of a system work well together. These can be parts that you've developed yourself or integrations with databases and web services.
Where the unit test focuses on testing the isolated unit, the integration test touches a larger amount of code and essentially tests the system as a whole. In unit testing, we go to great lengths to remove external dependencies, such as web services, databases, and the filesystem, from the testing itself. In integration testing, we work to include these dependencies, which has the following effects:
· We'll find where our code crashes because of unexpected results from the external system
· Integration tests are usually slower because they are I/O-bound
· Integration tests are brittle because they depend on the filesystem and network
The risk of external systems is that they provide results that are not expected. These results will not be found in unit testing, as you only test for expected errors and not for unexpected errors. Integration testing gives you the ability to find unexpected errors before you start debugging the code or functionally test it.
A program that is pure, without side effects, is an imaginary thing; however, if it did exist, it would not need any integration tests, as all states could be found with unit testing alone. Alas, a pure program cannot depend on any external state, such as a database, the filesystem, or console I/O. If you have a program that is not pure, you need integration testing to find the places in your code where an external dependency returns results that your code was not written to handle.
You run your unit tests often, and this is because they are fast. There are even tools such as NCrunch that will automatically run your tests in the background as you're writing code, having updated test results for you without you lifting a finger. This is pretty neat. With integration tests, this is impossible as integration tests are usually very slow. By very slow, we mean hundreds of milliseconds. This doesn't sound like much, but if you have 1,000 integration tests running at 100 milliseconds each, it will take 100 seconds to execute the tests, not something that you would want to do often.
This means that integration tests aren't run that often. Maybe they are run before every code check-in, if the developer is disciplined. They are run on the build server, but by then the faulty code has already been committed to the version history. I have seen integration test suites that took 5 hours to run and were run only nightly. In the morning, the developers would come in and get the results from the nightly build.
This is of course a downside of integration tests.
Another downside is that integration tests are brittle. There is not much we can do about this because it's the nature of these tests:
· They are bound by the filesystem, so they require free space on the device
· They are bound by the network, so they require the Ethernet/Wi-Fi to work
Many integration tests work with state by setting it up before the test runs and tearing it down when the test run is completed. Two such simultaneous test runs will affect each other's state and fail because of it. This will happen often in a team of 10 developers with one integration server. Your integration tests will fail, and it will not be because your code is broken. It is because the circumstances in which your test is run are bad, and if they were good, the tests would pass. With all this in mind, we need to think about what constitutes an integration test and when the value of the test is greater than its cost. How can we write good integration test suites?
Layer-for-layer testing
One way to do integration testing is to approach the problem in a layer-for-layer way. This way, you will try your external dependency at several levels of your system. The effect is that you'll get granular error messages that will be easy to follow up and fix.
The following image illustrates how to write layer-for-layer tests:
The first test will call the Data Access Layer (DAL) directly and find problems with the integration between the data access layer and the database. The next test will run against the Business Logic Layer (BLL) and find problems that the data from the database causes in the BLL. The test on the service facade will find problems that the database may cause in that layer, and the test against the web service API does the same for that layer.
A novice would point out that only the first test is needed, where testing is done directly on the data access layer, but a seasoned programmer would know that these abstractions are leaky, and that unexpected null values from the database might cause a lot of trouble when the same data is serialized to XML in the web service API.
These are the things that layer-for-layer testing will help you to find. The downsides are, of course, that you will pay for your granularity with a lot of redundancy. With redundancy comes high maintenance and slow test suites.
Top down testing
In top down testing, we treat the system as a black box and call only on the external interfaces. We ignore what is inside the box and care only about the result that we will get on our function calls. These calls could be made to a REST service or an assembly, or an HTTP call could be made to a web application. What matters is that we abstract away the internals of the system, and this way, gain in flexibility of that implementation. If the implementation changes, the tests will remain solid, as long as the public interface stays the same. In top down testing we call only the public API of the system:
The problem with this approach is that our tests will very rarely be able to specify what went wrong. Sure, a failing test will say what it expected and what result it got instead; however, the failures of integration testing are unexpected, and such unexpected errors seldom come out through the public interface. Instead, they are caught somewhere along the way and beautified for the user.
The result is a test suite that is very hard to read when it fails, and most often, you will have to dive into the log files in order to find the original exception that caused the error. This is very time consuming, and errors from these tests will consume much more time in troubleshooting than layer-for-layer tests. However, they will be faster to write and easier to maintain.
External interface testing
The last way to write these tests is to only test the actual external interface to the dependency. If the dependency is a database, we only test the actual query to the database and leave all the other layers alone, trusting our abstractions. This is a very efficient way of providing value to the development process very fast, by quickly ruling out the errors in the data access layer and leaving other errors for different kinds of testing, such as functional testing. External interface testing runs the minimal amount of code needed to test the integration:
The reason to do this is that it is cheap and has a high payback, but we will miss bugs that we need to be ready to deal with in some other way. Sometimes this is quite enough and provides the right amount of quality assurance for the effort we're willing to spend; at other times, we need to go all the way.
Your first integration test
Writing your first integration test is deceptively easy. You just call your code as you would in a unit test, without doing anything about the external dependencies.
We have some code that can get user information from the database by the UserName parameter:
module DataAccess =
    open System
    open System.Data
    open System.Data.Linq
    open Microsoft.FSharp.Data.TypeProviders
    open Microsoft.FSharp.Linq
    type dbSchema = SqlDataConnection<"Data Source=.;Initial Catalog=Chapter05;Integrated Security=SSPI;">
    let db = dbSchema.GetDataContext()
    // Enable the logging of database activity to the console.
    db.DataContext.Log <- System.Console.Out
    type User = { ID : int; UserName : string; Email : string }
    // get user by name
    let getUser name =
        query {
            for row in db.User do
            where (row.UserName = name)
            select { ID = row.ID; UserName = row.UserName; Email = row.Email }
        } |> Seq.tryFind (fun _ -> true)
We can write a test like this to verify its functionality:
[<Test>]
let ``should return user with email hello@mikaellundin.name when requesting username mikaellundin`` () =
    // act
    let user = getUser "mikaellundin"
    // assert
    user |> Option.isSome |> should be True
    user.Value.Email |> should equal "hello@mikaellundin.name"
This test will turn green because this user exists in the database. But writing a test that depends on a particular state in the database is a particularly bad idea. At some time in the future, the test will fail because the user is no longer in the database. Instead, we should decouple the integration test from the actual state and set up the state that we need before testing and tear it down right after our test is finished:
[<Test>]
let ``should be able to retrieve user e-mail from database`` () =
    let dbUser = new dbSchema.ServiceTypes.User(ID = -1, UserName = "testuser", Email = "test@test.com")
    // setup
    db.User.InsertOnSubmit(dbUser)
    db.DataContext.SubmitChanges()
    // act
    let user = getUser dbUser.UserName
    // assert
    user |> Option.isSome |> should be True
    user.Value.Email |> should equal dbUser.Email
    // teardown
    db.User.DeleteOnSubmit(dbUser)
    db.DataContext.SubmitChanges()
We're inserting a new user, a test record, and we're assigning a negative Primary Key ID (PKID) to it so it won't conflict with the real data. After inserting it, we run the test and assert it; then, at last, we remove the test record.
This works well in the happy case. When the test fails, however, it will leave a test record in the database, and the next time the test is run, it will be unable to insert the test record because it is already there. The test is also not safe to run concurrently, so it will easily fail when two developers run the test suite at the same time. Running the test inside a transaction addresses both problems:
[<Test>]
let ``should be able to get username from database`` () =
    let dbUser = new dbSchema.ServiceTypes.User(ID = -1, UserName = "testuser", Email = "test@test.com")
    // setup
    db.Connection.Open()
    let transaction = db.Connection.BeginTransaction(isolationLevel = IsolationLevel.Serializable)
    db.DataContext.Transaction <- transaction
    db.User.InsertOnSubmit(dbUser)
    db.DataContext.SubmitChanges()
    try
        // act
        let user = getUser dbUser.UserName
        // assert
        user |> Option.isSome |> should be True
        user.Value.UserName |> should equal dbUser.UserName
    finally
        // teardown
        transaction.Rollback()
        db.Connection.Close()
The code is not that simple anymore, but now the whole integration test runs in a transaction that is rolled back at the very end, regardless of whether the test succeeds or fails. We have also protected ourselves from concurrent runs by using the IsolationLevel.Serializable isolation level, which will place locks on the database table until the test is finished. This means that if any other test suite runs at the same time, it will have to wait until this test finishes, avoiding the concurrency issue.
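The try/finally shape of this test can be factored into a small helper so that teardown always runs. The following is a minimal sketch rather than the chapter's own code: withTeardown, insertTestUser, and removeTestUser are hypothetical names, and an in-memory list stands in for the User table so that the example runs without a database.

```fsharp
// Hypothetical helper: run setup, then the test body, and always run
// teardown, even when the body throws (for example, on a failed assertion).
let withTeardown (setup: unit -> 'state) (teardown: 'state -> unit) (body: 'state -> 'result) : 'result =
    let state = setup ()
    try
        body state
    finally
        teardown state

// An in-memory stand-in for the User table (an assumption for this sketch).
let users = System.Collections.Generic.List<string>()

let insertTestUser () =
    users.Add "testuser"
    "testuser"

let removeTestUser (name: string) =
    users.Remove name |> ignore

// A passing body: the record is visible during the test and removed afterwards.
let found =
    withTeardown insertTestUser removeTestUser (fun name -> users.Contains name)

// A failing body: the exception propagates, but teardown has still run,
// so no test record is left behind.
let failedCleanly =
    try
        withTeardown insertTestUser removeTestUser (fun _ -> failwith "assertion failed")
    with _ ->
        users.Count = 0
```

With a real database, the setup function would open the connection and begin the transaction, and the teardown function would roll it back and close the connection.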
Sadly, not many integration tests in the real world are as simple as this one, as the data is dependent on relations to other records and lookup tables that require you to insert a whole graph of data before being able to do the actual testing.
Setting up and tearing down databases
When you're working in a greenfield project and have a favorable situation of designing a database from the ground up, you have complete control over your database when it comes to integration tests, if you do it correctly.
While working on a greenfield project where database development is part of the project, you should adopt fluent migrations as a part of the development cycle. Writing change scripts for database changes has been around for a very long time, but it was Ruby on Rails that popularized migrations as a part of the software development lifecycle, and this has been adopted by .NET in the FluentMigrator library.
Let's start by adding a reference to the FluentMigrator library in our project:
Now we can write our first migration:
open FluentMigrator
type Email = { FromAddress : string; ToAddress : string; Subject : string; Body : string }
[<Migration(1L)>]
type CreateEmailTable () =
    inherit Migration()
    let tableName = "Email"
    override this.Up () =
        ignore <| this.Create.Table(tableName)
                      .WithColumn("ID").AsInt32().NotNullable().PrimaryKey().Identity()
                      .WithColumn("FromAddress").AsString().NotNullable()
                      .WithColumn("ToAddress").AsString().NotNullable()
                      .WithColumn("Subject").AsString().NotNullable()
                      .WithColumn("Body").AsString().NotNullable()
        ignore <| this.Insert.IntoTable(tableName)
                      .Row({ FromAddress = "test@test.com";
                             ToAddress = "hello@mikaellundin.name";
                             Subject = "Hello";
                             Body = "World" })
    override this.Down () =
        ignore <| this.Delete.Table(tableName)
What we see here is an implementation of the Migration abstract class. It overrides two methods, Up() and Down(). The Up() method specifies what happens when migrating up to this migration, and the Down() method specifies what happens when rolling it back.
The attribute on the class has a number that defines in what order this migration should be run. If we call the migration framework with the number 1, the table will get created, and if we call the migration framework with the number 0, the table will be deleted.
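The ordering semantics of version numbers can be illustrated with a small in-memory model. This is a sketch under stated assumptions and not FluentMigrator's API: MigrationStep and migrateTo are hypothetical names, the schema is modeled as a set of table names, and the second migration (a User table) is invented for illustration.

```fsharp
// A hypothetical, in-memory model of versioned migrations. This is not
// FluentMigrator's API; it only illustrates the ordering semantics.
type MigrationStep =
    { Version: int64
      Up: Set<string> -> Set<string>      // tables after migrating up
      Down: Set<string> -> Set<string> }  // tables after rolling back

let migrations =
    [ { Version = 1L; Up = Set.add "Email"; Down = Set.remove "Email" }
      { Version = 2L; Up = Set.add "User"; Down = Set.remove "User" } ]

// Move the schema from the current version to the target version:
// upwards in ascending order, downwards in descending order.
let migrateTo (target: int64) (current: int64, tables: Set<string>) =
    let steps =
        if target >= current then
            migrations
            |> List.filter (fun m -> m.Version > current && m.Version <= target)
            |> List.sortBy (fun m -> m.Version)
            |> List.map (fun m -> m.Up)
        else
            migrations
            |> List.filter (fun m -> m.Version > target && m.Version <= current)
            |> List.sortBy (fun m -> -m.Version)
            |> List.map (fun m -> m.Down)
    target, List.fold (fun schema step -> step schema) tables steps

let emptyDatabase = 0L, Set.empty<string>
let atVersion1 = migrateTo 1L emptyDatabase   // the Email table is created
let backAtZero = migrateTo 0L atVersion1      // the Email table is deleted again
```

Migrating to version 1 runs only the first Up step, and migrating back to version 0 runs its Down step, which mirrors the behavior described for the Migration attribute.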
After working on the project for a while, we will have a complete set of migrations that will successfully build the database from scratch. We can use this when running our integration tests.
Now you can use the FluentMigrator library to create the database from scratch for your test suite:
open FluentMigrator
open FluentMigrator.Runner
open FluentMigrator.Runner.Initialization
open FluentMigrator.Runner.Announcers
// sharing database name between tests
let mutable dbName = ""
type MigrationOptions () =
    interface IMigrationProcessorOptions with
        member this.PreviewOnly = false
        member this.Timeout = 60
        member this.ProviderSwitches = ""
let connectionString dbName = sprintf "Data Source=.;Initial Catalog=%s;Integrated Security=SSPI;" dbName
[<TestFixtureSetUp>]
let ``create and migrate database`` () =
    // constants
    dbName <- sprintf "Chapter05_%s" (System.DateTime.Now.ToString("yyyyMMddHHmm"))
    // create database
    use connection = new System.Data.SqlClient.SqlConnection(connectionString "master")
    use createCommand = new System.Data.SqlClient.SqlCommand("CREATE DATABASE " + dbName, connection)
    connection.Open()
    createCommand.ExecuteNonQuery() |> ignore
    // build database from migrations
    let announcer = new TextWriterAnnouncer(System.Diagnostics.Debug.WriteLine)
    let assembly = System.Reflection.Assembly.GetExecutingAssembly()
    let migrationContext = new RunnerContext(announcer)
    migrationContext.Namespace <- "chapter05"
    let options = new MigrationOptions()
    let factory = new FluentMigrator.Runner.Processors.SqlServer.SqlServer2012ProcessorFactory()
    let processor = factory.Create((connectionString dbName), announcer, options)
    let runner = new MigrationRunner(assembly, migrationContext, processor)
    runner.MigrateUp(true)
We create a test fixture setup that will run before any of our tests. During development, it is common to share a database between developers to reduce the amount of duplicate work, but when testing, we don't want to run the tests in that same database. Instead, we should have a fixture setup to create a local database on our own machine and build it up using the migrations. This way, we will have a completely fresh database where tests can be run in isolation:
[<TestFixtureTearDown>]
let ``drop the database`` () =
    // drop database
    use connection = new System.Data.SqlClient.SqlConnection(connectionString "master")
    use dropCommand = new System.Data.SqlClient.SqlCommand("DROP DATABASE " + dbName, connection)
    // the connection must be open before the command can be executed
    connection.Open()
    dropCommand.ExecuteNonQuery() |> ignore
The same way, we will have a fixture teardown function that will remove the database when testing is done. We don't want a lot of test databases to hang around on our machine:
open System.Data
open System.Data.Linq
open Microsoft.FSharp.Data.TypeProviders
open Microsoft.FSharp.Linq
type dbSchema = SqlDataConnection<"Data Source=.;Initial Catalog=CustomerRelationsDB;Integrated Security=SSPI;">
let createdb () =
    let db = dbSchema.GetDataContext(connectionString dbName)
    db.DataContext.Log <- System.Console.Out
    db
let insert (email : Email) =
    let db = createdb()
    db.Email.InsertOnSubmit(new dbSchema.ServiceTypes.Email(FromAddress = email.FromAddress, ToAddress = email.ToAddress, Subject = email.Subject, Body = email.Body))
    db.DataContext.SubmitChanges()
let getFrom fromAddress =
    let db = createdb()
    query {
        for row in db.Email do
        where (row.FromAddress = fromAddress)
        select { FromAddress = row.FromAddress; ToAddress = row.ToAddress; Subject = row.Subject; Body = row.Body }
    }
The important part of the system under test (SUT) is that we can easily exchange the connection string to use the database we have built up for testing. With this code, we can insert new e-mail messages into a database table and search for messages by the FromAddress attribute:
[<Test>]
let ``can insert into email table`` () =
    // arrange
    let email = { FromAddress = "my@test.com"; ToAddress = "hello@mikaellundin.name"; Subject = "Test"; Body = "Will be queued for sending" }
    // act
    insert email
    // assert
    let daEmail = getFrom email.FromAddress |> Seq.nth(0)
    daEmail.FromAddress |> should equal email.FromAddress
    daEmail.ToAddress |> should equal email.ToAddress
    daEmail.Subject |> should equal email.Subject
    daEmail.Body |> should equal email.Body
The test is now as naive as the one we started out with. It can focus on just being a test and verify that the integration with the database works. The setup of the test database and isolation of test data has been taken care of in the test fixture setup and teardown processes.
Even though this procedure seems very costly, running the whole test on my developer machine takes no more than 60 ms. Performance gains come from not having to make calls over the network, but relying on the internal communication on the machine.
The benefit of this method is that you get a completely version-controlled database with migrations that are properly tested before going into production. It moves the development of the database from the Database Administrator (DBA) to the application developer, and it allows the developer to test in isolation on his or her own machine. When running tests, there will be no conflict between two developers working together on the same project, as they will run their tests in separate environments. Also, these tests will run quite fast, as they don't have to run over the network.
However, we are dependent on a greenfield database setup, and the approach only scales up to a certain level of data and complexity. Once you move into brownfield development, a different set of tools is needed to handle setup and teardown.
Brownfield database setup
Many times when working with databases, we inherit a database that we need to integrate with. We don't have the migration scripts to build the database, and it would be too time consuming to create them in retrospect. What we can do instead is take a backup of the production database and restore it prior to testing.
In order to do this, you need to reference the Microsoft.SqlServer.Smo and Microsoft.SqlServer.SmoExtended assemblies. They should be in the Global Assembly Cache (GAC) if you have a SQL Server installed locally:
Once we have these references available, we can write a test fixture setup that will restore a database backup prior to our testing using the SMO framework:
open NUnit.Framework
open FsUnit
open Microsoft.SqlServer.Management.Smo
let dbFilePath = @"C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\"
let dbName = sprintf "Chapter05_%s" (System.DateTime.Now.ToString("yyyyMMddHHmm"))
[<TestFixtureSetUp>]
let ``Setup database`` () : unit =
    let server = new Server(".")
    server.ConnectionContext.LoginSecure <- true
    server.ConnectionContext.Connect()
    try
        let restore = new Restore()
        restore.Database <- dbName
        restore.Action <- RestoreActionType.Database
        restore.Devices.AddDevice(@"C:\chapter05.bak", DeviceType.File)
        restore.ReplaceDatabase <- true
        restore.NoRecovery <- false
        restore.RelocateFiles.Add(new RelocateFile("Chapter05_201410302303", dbFilePath + dbName + "_Data.mdf")) |> ignore
        restore.RelocateFiles.Add(new RelocateFile("Chapter05_201410302303_log", dbFilePath + dbName + "_Log.ldf")) |> ignore
        restore.SqlRestore(server)
    finally
        server.ConnectionContext.Disconnect()
This code will open up a connection to the local Microsoft SQL Server and restore a database backup file from the disk. The same way, we can use the SMO framework to delete the database when the test is done:
[<TestFixtureTearDown>]
let ``Tear down database`` () : unit =
    let server = new Server(".")
    server.ConnectionContext.LoginSecure <- true
    server.ConnectionContext.Connect()
    try
        server.KillDatabase(dbName)
    finally
        server.ConnectionContext.Disconnect()
In my simple database test, this takes no more than 60 ms to perform, and it will be much more efficient than replaying the whole migration history as the database and its complexity grow.
However, this solution is not as clean as the migration setup, as you will bring a backed up state into the test database. This can be good for finding problems when using real data, or bad, if your tests start depending on data that is brought in with the backup.
Really large databases
The tips in the preceding section will work great for small databases. Actually, they will work great for most databases, as you can always strip out the data and just use the schema, which takes very little space at all. Your tests should not depend on the data in the database but only test the integration with the database. For this, you don't need a full set of production data to go along with it.
However, there are situations when it's not feasible to work with a stripped-down version of a database, and you need another tactic.
I was working for a client that had a reasonably large database of 50 GB of data. We were doing a heavy amount of testing and the problem we had was to keep an updated version of this data and be able to run integration tests in isolation, as they would fail from time to time when two test suites were run at the same time.
A part of this database was the Swedish national repository for home addresses of all the 10 million residents. We could have cleared the tables of all their data, but this would change the prerequisites for the system under test. The database doesn't behave the same way with 5 rows of test data as it does with 2 million rows of real data, so we decided to leave the data in there.
This is not a database that you can easily restore every time you want to run your test suite. Also, the relation tree in the database was quite extensive and you wouldn't want to insert complete sets of test data by code. Instead, we created a stored procedure in the database to set up our test model:
CREATE PROCEDURE SetupTestData
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    -- local variables
    DECLARE @PersonID1 int,
            @PersonID2 int,
            @PersonID3 int,
            @AddressID1 int,
            @AddressID2 int;
    -- Create test persons
    INSERT INTO dbo.Person (SSN, FirstName, GivenName, LastName, BirthDate)
    VALUES ('193808209005', 'Ingela Ping', 'Ingela', 'Forsman', '1938-08-20');
    SET @PersonID1 = SCOPE_IDENTITY();
    INSERT INTO dbo.Person (SSN, FirstName, GivenName, LastName, BirthDate)
    VALUES ('194212259005', 'Inge Pong', 'Inge', 'Forsman', '1942-12-25');
    SET @PersonID2 = SCOPE_IDENTITY();
    INSERT INTO dbo.Person (SSN, FirstName, GivenName, LastName, BirthDate)
    VALUES ('196403063374', 'Jesper', 'Jesper', 'Forsman', '1964-03-06');
    SET @PersonID3 = SCOPE_IDENTITY();
    -- ... more test persons
    -- Create test addresses
    INSERT INTO dbo.[Address] (StreetName, StreetNumber, PostalCode, Town)
    VALUES ('Lantgatan', '38', '12559', 'SOLNA');
    SET @AddressID1 = SCOPE_IDENTITY();
    INSERT INTO dbo.[Address] (StreetName, StreetNumber, StreetLetter, ApartmentNumber, [Floor], PostalCode, Town)
    VALUES ('Stångmästarevägen', '11-13', 'A', '1201', '2', '15955', 'STOCKHOLM');
    SET @AddressID2 = SCOPE_IDENTITY();
    -- ... more test addresses
    -- Create relations
    INSERT INTO dbo.NationalRegistrationAddress (Person, [Address], [Date])
    VALUES (@PersonID1, @AddressID1, '1970-01-01');
    INSERT INTO dbo.NationalRegistrationAddress (Person, [Address], [Date])
    VALUES (@PersonID2, @AddressID1, '1970-01-01');
    INSERT INTO dbo.NationalRegistrationAddress (Person, [Address], [Date])
    VALUES (@PersonID3, @AddressID1, '1970-01-01');
    INSERT INTO dbo.NationalRegistrationAddress (Person, [Address], [Date])
    VALUES (@PersonID3, @AddressID2, '1982-03-06');
    -- ...more test relations
END
GO
Now, we treat the database and all its data as the ground state, and we will not touch or query it. All the test data that we need is added by this stored procedure. This, however, presents us with a couple of problems:
· You want to insert this data for every test run, but once it is run, the test data will always be there
· It's not isolated; another test can appear and modify this set of data
To solve this, we go back to the beginning of this chapter and look at how to run the setup in a transaction that we will then roll back.
If we have the following system under test:
type dbSchema = SqlDataConnection<"Data Source=.;Initial Catalog=Chapter05;Integrated Security=SSPI;", StoredProcedures = true>
let db = dbSchema.GetDataContext()
// Enable the logging of database activity to the console.
db.DataContext.Log <- System.Console.Out
// get address history of person with ssn number
let getAddressHistory ssn =
    query {
        for addressHistory in db.NationalRegistrationAddress do
        join person in db.Person on (addressHistory.Person = person.ID)
        join address in db.Address on (addressHistory.Address = address.ID)
        where (person.SSN = ssn)
        select address
    }
This code will return all the addresses that a person has been living at using their Social Security Number (SSN). We do this by simply joining the many-to-many relationship together and querying it on the SSN. This should be no trouble for the database as long as the SSN is indexed properly.
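To make the join easier to follow, the same query can be run against in-memory lists. This is only a sketch: the record types Person, Address, and Registration and their field names are hypothetical stand-ins for the types generated by the type provider, and the rows are copied from the SetupTestData stored procedure.

```fsharp
// In-memory stand-ins for the three tables (hypothetical record types;
// the real ones are generated by the type provider).
type Person = { PersonID: int; SSN: string }
type Address = { AddressID: int; StreetName: string }
type Registration = { Resident: int; Residence: int }

// Rows mirroring the test data inserted by SetupTestData.
let persons =
    [ { PersonID = 1; SSN = "193808209005" }
      { PersonID = 2; SSN = "194212259005" }
      { PersonID = 3; SSN = "196403063374" } ]

let addresses =
    [ { AddressID = 1; StreetName = "Lantgatan" }
      { AddressID = 2; StreetName = "Stångmästarevägen" } ]

let registrations =
    [ { Resident = 1; Residence = 1 }
      { Resident = 2; Residence = 1 }
      { Resident = 3; Residence = 1 }
      { Resident = 3; Residence = 2 } ]

// Same shape as the database query: join the relation table to both
// sides, filter on the SSN, and return the matching street names.
let getAddressHistoryInMemory ssn =
    query {
        for registration in registrations do
        join person in persons on (registration.Resident = person.PersonID)
        join address in addresses on (registration.Residence = address.AddressID)
        where (person.SSN = ssn)
        select address.StreetName
    }
    |> Seq.toList

let jesperHistory = getAddressHistoryInMemory "196403063374"
```

The person with SSN 196403063374 is related to both test addresses, so the history contains two entries, while the other two test persons have one each.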
Now, let's write a test that looks as follows:
[<Test>]
let ``should get address history from SSN number`` () =
    let ssn = "196403063374"
    // setup
    db.Connection.Open()
    let transaction = db.Connection.BeginTransaction(isolationLevel = IsolationLevel.Serializable)
    db.DataContext.Transaction <- transaction
    db.SetupTestData() |> ignore // <-- here the db is prepped with test data
    try
        // act
        let addresses = getAddressHistory ssn |> Seq.toList
        // assert
        addresses.Length |> should equal 2
    finally
        // teardown
        transaction.Rollback()
        db.Connection.Close()
Here, we made assumptions on the data in the database, namely on the test data in the stored procedure. It would be optimal to create a module for test data that we could refer to instead of writing out the test data directly in our test, as the data will be under high reuse.
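A test data module of this kind could look something like the following. It is a sketch, not the author's code: the module name TestData, the TestPerson type, and the value names are assumptions, while the SSNs and address counts are taken from the rows that SetupTestData inserts.

```fsharp
// Hypothetical module mirroring the rows inserted by SetupTestData.
// The values are copied from the stored procedure; the names are assumptions.
module TestData =
    type TestPerson = { SSN: string; ExpectedAddressCount: int }

    let ingela = { SSN = "193808209005"; ExpectedAddressCount = 1 }
    let inge = { SSN = "194212259005"; ExpectedAddressCount = 1 }
    let jesper = { SSN = "196403063374"; ExpectedAddressCount = 2 }

    // All test persons, for tests that iterate over the whole set.
    let all = [ ingela; inge; jesper ]

// A test can then refer to the shared values instead of literals.
let ssnUnderTest = TestData.jesper.SSN
let expectedCount = TestData.jesper.ExpectedAddressCount
```

The assertion in the test would then compare addresses.Length with TestData.jesper.ExpectedAddressCount, so a change to the stored procedure only has to be mirrored in one place.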
What's happening here is that the test data is inserted by calling the stored procedure, but only in this transaction. The test is run on the transaction, and when the test completes, the transaction is rolled back and no state is changed on the actual tables. This also makes sure that no other test runner can see the data that we're operating on and there will be no conflicts. There might be a performance hit because of the chosen locking mechanism, and you should never try this on a production database.
What we managed to do here is create a sensible integration testing strategy for heavy databases without the problems of persisting state or conflicting test runs. We've found that integration testing looks much easier than it actually is. However, the complexity of an integration test can be mitigated, and in the end, these tests will bring us a lot of value in the long run.
Speeding up integration testing
Integration tests are slow. This is not noticeable when you are at the beginning of the product life cycle and have 50 integration tests, but when you start closing in on 1,000 integration tests, they will take a long time to run. This is, of course, because they are I/O-bound, waiting for network traffic.
A unit test should never take more than 10 ms to complete. This means that 1,000 unit tests will run in under 10 seconds. This is enough time to make you lose focus but not enough to be called a major hurdle in your work process.
On the other hand, it is not uncommon for an integration test to take 150 ms to complete. If you have 1,000 integration tests, it will take 2.5 minutes to complete the test run. Now, this is a bit optimistic. I had an integration test suite of 850 integration tests running in 18 minutes. Why is this important?
The time it takes to run a test suite will decide whether it will be run or not. When a test suite takes a long time to run, these tests will only be run on the build server, where the developer doesn't have to sit around looking at the screen.
As developers aren't running their tests before committing to source control, this means that faulty code will be committed more often and the source control will be in a faulty state. Slow tests are bad and fast tests are good.
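The arithmetic behind these estimates can be sketched in a few lines. The function name suiteDuration is hypothetical; the figures are the ones quoted in this section, and the calculation assumes the tests run strictly one after another.

```fsharp
// Estimated wall-clock time for a sequential test run.
let suiteDuration (testCount: int) (millisecondsPerTest: float) =
    System.TimeSpan.FromMilliseconds(float testCount * millisecondsPerTest)

// 1,000 unit tests at 10 ms each: 10 seconds.
let unitTests = suiteDuration 1000 10.0

// 1,000 integration tests at 150 ms each: 150 seconds, or 2.5 minutes.
let integrationTests = suiteDuration 1000 150.0

// Average time per test in the 850-test suite that ran for 18 minutes:
// 1,080 seconds / 850, which is roughly 1.27 seconds per test.
let averagePerTest = System.TimeSpan.FromMinutes(18.0).TotalSeconds / 850.0
```

The 18-minute suite averages well over a second per test, which is why the 150 ms estimate is described as optimistic.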
Testing in parallel
The most common problem is that our tests are run sequentially and our integration tests spend most of the time waiting on the network.
A query is sent to the database. The test sits around waiting for the database to return a result before it can assert on the expectations. This is a waste and we could definitely speed it up by running tests asynchronously; however, there is no framework available currently that will allow it. Instead, we need to look at running our tests in parallel.
There aren't many frameworks that will allow you to do this. We need to choose between the discontinued MbUnit and version 3 of NUnit, which as of this writing only exists as an alpha release. The FAKE build tool also contains helpers to execute test suites in parallel, but it depends on the tests being in different assemblies. It will not parallelize tests in the same assembly.
I chose to write this chapter with the alpha release, looking forward to it being production-ready by the time this book is published.
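To see why overlapping the waits helps I/O-bound tests, the effect can be simulated without a database or a test framework. This is a sketch under stated assumptions: simulatedTest and timed are hypothetical names, a 100 ms Async.Sleep stands in for a database round trip, and Async.Parallel stands in for a parallel test runner.

```fsharp
open System.Diagnostics

// Simulates an I/O-bound test: it waits 100 ms (standing in for a database
// round trip) and then returns its own id.
let simulatedTest (id: int) =
    async {
        do! Async.Sleep 100
        return id
    }

// Runs a function and returns its result together with the elapsed milliseconds.
let timed (run: unit -> 'a) =
    let stopwatch = Stopwatch.StartNew()
    let result = run ()
    result, stopwatch.ElapsedMilliseconds

let testIds = [ 1 .. 5 ]

// One after another: the total time is roughly the sum of the waits.
let sequentialResults, sequentialMs =
    timed (fun () ->
        testIds
        |> List.map (fun id -> simulatedTest id |> Async.RunSynchronously))

// All at once: the total time is roughly the longest single wait.
let parallelResults, parallelMs =
    timed (fun () ->
        testIds
        |> List.map simulatedTest
        |> Async.Parallel
        |> Async.RunSynchronously
        |> Array.toList)

printfn "sequential: %d ms, parallel: %d ms" sequentialMs parallelMs
```

Both runs produce the same results, but the parallel run finishes in a fraction of the time because the waits overlap. Real tests only benefit in this way if they do not share state, which is why each parallel test needs its own data context and transaction.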
Let's reuse our previous example with the address register and extend it a little bit:
type dbSchema = SqlDataConnection<"Data Source=.;Initial Catalog=Chapter05;Integrated Security=SSPI;", StoredProcedures = true>
// get address history of person with ssn number
let getAddressHistory ssn (db : dbSchema.ServiceTypes.SimpleDataContextTypes.Chapter05) =
    query {
        for addressHistory in db.NationalRegistrationAddress do
        join person in db.Person on (addressHistory.Person = person.ID)
        join address in db.Address on (addressHistory.Address = address.ID)
        where (person.SSN = ssn)
        select address
    }
let getHabitantsAtAddress streetName streetNumber streetLetter postalCode (db : dbSchema.ServiceTypes.SimpleDataContextTypes.Chapter05) =
    query {
        for addressHistory in db.NationalRegistrationAddress do
        join person in db.Person on (addressHistory.Person = person.ID)
        join address in db.Address on (addressHistory.Address = address.ID)
        where (address.StreetName = streetName &&
               address.StreetNumber = streetNumber &&
               address.StreetLetter = streetLetter &&
               address.PostalCode = postalCode)
        select person
    }
For this, I've written five tests with the following names:
· Should get address history from the SSN number
· Should return empty address history when not found
· Should get all the habitants of the address
· Should get the only habitant of an address
· Should get an empty result of habitants when the address doesn't exist
We're not going into the details of these tests, as I have already described the method, and these are very similar.
When we run these tests in the command line, we will get the following output:
Test Run Summary
Overall result: Passed
Tests run: 5, Errors: 0, Failures: 0, Inconclusive: 0
Not run: 0, Invalid: 0, Ignored: 0, Skipped: 0
Duration: 3.389 seconds
These five tests in total take about 3–5 seconds to run. What we can do in this version of NUnit is decorate our tests with the Parallelizable attribute. This will enable our tests to run in parallel:
[<Test>]
[<Parallelizable(ParallelScope.Self)>]
let ``should get the only habitant of an address`` () =
    let db = dbSchema.GetDataContext()
    db.DataContext.Log <- System.Console.Out
    let streetName, streetNumber, streetLetter, postalCode =
        ("Stångmästarevägen", "11-13", "A", "15955")
    // setup
    db.Connection.Open()
    let transaction = db.Connection.BeginTransaction(isolationLevel = IsolationLevel.ReadCommitted)
    db.DataContext.Transaction <- transaction
    db.SetupTestData() |> ignore // <-- here the db is prepped with test data
    try
        // act
        let persons =
            getHabitantsAtAddress streetName streetNumber streetLetter postalCode db
            |> Seq.toList
            |> List.map (fun person -> person.SSN)
        // assert
        Assert.That(persons, Is.EqualTo(["196403063374"]))
    finally
        // teardown
        transaction.Rollback()
        db.Connection.Close()
We can also specify the level of parallelization, that is, how many threads will run our tests, with the following assembly attribute:
module AssemblyInfo =
    open NUnit.Framework
    [<assembly: LevelOfParallelization(5)>]
    do ()
We will see the result when we run the test suite again:
Test Run Summary
Overall result: Passed
Tests run: 5, Errors: 0, Failures: 0, Inconclusive: 0
Not run: 0, Invalid: 0, Ignored: 0, Skipped: 0
Start time: 2014-11-04 22:42:41Z
End time: 2014-11-04 22:42:44Z
Duration: 2.790 seconds
The more tests we have, the larger the gain from parallelizing them. You will have to take care, however, if the underlying systems that you're integrating with are serializing your transactions. In that case, nothing is gained from parallelizing; you might actually end up with performance that is worse than before.
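The effect of a serializing backend can be sketched without a database. In the following hypothetical example, a lock stands in for whatever the database does when it serializes transactions, and the maxConcurrency function (a name made up for this illustration) reports how many simulated calls were ever in flight at the same time:

```fsharp
open System.Threading

// Runs `workers` simulated tests in parallel and returns the highest number
// that were inside the simulated database call at the same time.
let maxConcurrency (serialize : bool) (workers : int) =
    let gate = obj ()          // stands in for serialization in the database
    let statsLock = obj ()     // protects the bookkeeping below
    let running = ref 0
    let maxRunning = ref 0
    let databaseCall () =
        let now = Interlocked.Increment(running)
        lock statsLock (fun () -> maxRunning.Value <- max maxRunning.Value now)
        Thread.Sleep 50        // pretend to wait for the query
        Interlocked.Decrement(running) |> ignore
    let worker =
        async {
            if serialize then lock gate databaseCall
            else databaseCall ()
        }
    List.replicate workers worker
    |> Async.Parallel
    |> Async.RunSynchronously
    |> ignore
    maxRunning.Value

printfn "serialized: %d at a time" (maxConcurrency true 4)
printfn "not serialized: %d at a time" (maxConcurrency false 4)
```

With the gate in place, only one call is ever in flight, no matter how many threads the test runner starts. The threads just queue up, and the scheduling overhead comes on top of the sequential running time.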
Testing stored procedures
Many developers fear and loathe the database stored procedure because they have been told to do so. If you weren't a developer 10 years ago, you won't know where the pain came from. It's not because of the Transact-SQL (T-SQL) language, which many think is the reason. It's not because of badly written stored procedures, because anyone can write bad code in any language. It has to do with DBAs.
The database administrator is the person assigned to be responsible for the database. With this responsibility comes the operational tasks of setting up new databases and validating backups and monitoring them.
A DBA is the owner of the database domain, and as such, he or she wouldn't want developers creating tables, indexes, and such. Instead, the DBA will provide the tables necessary for the developer to store their data. The problem here is that application development and data persistence go hand in hand and need to be done in the same development cycle. However, 10 years ago, the database was considered a service, and as such, DBAs provided an API in the form of stored procedures. This API would let the DBA force constraints on the caller and fine-tune permissions at a detailed level. The stored procedure API would entrench the DBA's importance and cause data persistence to become a drag.
The main problem with these stored procedure APIs was that they were an unnecessary abstraction that only got in the way of application development and served to entrench the DBA's position.
The second problem was that DBAs were seldom programmers themselves and not really fit to produce a well-crafted API. Working with these stored procedures could be challenging at best, and opening up and looking at the implementation would scare most developers dumb.
I think that this structure was hurtful for the software industry, where DBAs evangelized relational databases as the only data store so they could keep their domain to themselves. When this started to loosen up, other data persistence alternatives quickly came to life. Today, we only see DBAs in large enterprise organizations. In all other cases, operations has taken over the operational tasks of the database, and developers happily connect directly to tables without passing through unnecessary layers of abstraction.
Despite its history, the stored procedure is quite a powerful construct to simplify certain tasks within the database, as long as you don't use it to hide business rules that should be expressed in the code.
This example will mimic a simple database table structure for a CMS system:
A page in the system, not presented here, has a page type. This page type has properties, and each property is of a specific property type.
Let's populate this table structure with some stub data:
[<SetUp>]
let ``insert stub data into cms`` () : unit =
    let db = dbSchema.GetDataContext()
    // truncate the tables
    db.Truncate() |> ignore
    // create page type
    let contentPage = new dbSchema.ServiceTypes.PageType(Name = "ContentPage")
    // create property types
    let stringPropertyType = new dbSchema.ServiceTypes.PropertyType(Name = "PropertyString")
    let booleanPropertyType = new dbSchema.ServiceTypes.PropertyType(Name = "PropertyBoolean")
    let htmlPropertyType = new dbSchema.ServiceTypes.PropertyType(Name = "PropertyHtml")
    // create properties for content page
    let pageNameProperty = new dbSchema.ServiceTypes.Property(Name = "PageName", PageType = contentPage, PropertyType = stringPropertyType)
    let visibleInMenuProperty = new dbSchema.ServiceTypes.Property(Name = "VisibleInMenu", PageType = contentPage, PropertyType = booleanPropertyType)
    let mainBodyProperty = new dbSchema.ServiceTypes.Property(Name = "MainBody", PageType = contentPage, PropertyType = htmlPropertyType)
    // insert
    db.Property.InsertAllOnSubmit [pageNameProperty; visibleInMenuProperty; mainBodyProperty]
    db.DataContext.SubmitChanges()
As you may have noted, the data was not inserted within a transaction. Instead, we insert the data every time the setup runs. In order to do this, we need to truncate the tables in between.
Having the data set up in code might be beneficial if we can make it available to tests that are going to operate on it.
However, we need a truncate method, which will remove all the data. This is best done as a stored procedure, so we keep the knowledge of the tables in the same place. The stored procedure to truncate the tables is very simple:
CREATE PROCEDURE [Truncate] AS
BEGIN
    DELETE FROM [Property]
    DELETE FROM [PropertyType]
    DELETE FROM [PageType]
END
GO
I would call this during the setup of the test and not the teardown, as it can be interesting to inspect the database state post-mortem after a test has failed. If we clear the database after a test has run, we lose this possibility.
Now we can write a test that verifies the stored procedure works:
[<Test>]
let ``truncate should clear all tables`` () =
    // arrange
    let db = dbSchema.GetDataContext()
    // act
    db.Truncate() |> ignore
    // assert
    let properties = query { for property in db.Property do select property } |> Seq.toList
    properties.IsEmpty |> should be True
    let propertyTypes = query { for propertyType in db.PropertyType do select propertyType } |> Seq.toList
    propertyTypes.IsEmpty |> should be True
    let pageTypes = query { for pageType in db.PageType do select pageType } |> Seq.toList
    pageTypes.IsEmpty |> should be True
The interesting thing here is that the F# SQL type provider generates a method through which we can easily call the stored procedure and test it. We do not have to write any stub code for this, which is what makes it so powerful.
Let's try a more complicated example and introduce the Page and PropertyValue tables, which contain the pages that are going to get rendered:
This data model is normalized and a bit hard to work with. Pulling out all pages of a specific type from the data store is not very pleasant and requires a lot of joins. What we can do is define a stored procedure that will help us deliver a denormalized data model:
CREATE PROCEDURE GetPagesOfPageType
    @PageType varchar(50)
AS
BEGIN
    SET NOCOUNT ON;
    SET FMTONLY OFF;
    DECLARE @Properties varchar(MAX),
            @Query AS NVARCHAR(MAX)
    -- extract columns as comma separated string from page type
    SELECT @Properties = STUFF(
        (SELECT ',' + property.[Name]
         FROM [Property] property
         INNER JOIN [PageType] pageType ON property.PageType_FK = pageType.ID
         WHERE pageType.Name = @PageType
         FOR XML PATH('')), 1, 1, '')
    -- build the pivot query with one column per property
    SET @Query = N'SELECT pageID, ' + @Properties + N' FROM
    (
        SELECT [page].ID as pageID,
               property.[Name] as name,
               propertyValue.[Value] as value
        FROM [Property] property
        INNER JOIN [PageType] pageType ON property.PageType_FK = pageType.ID
        INNER JOIN [Page] [page] ON [page].PageType_FK = pageType.ID
        INNER JOIN [PropertyValue] propertyValue ON propertyValue.Property_FK = property.ID AND propertyValue.Page_FK = [page].ID
        WHERE pageType.Name = ''' + @PageType + N'''
    ) x
    PIVOT
    (
        max(value)
        FOR name IN (' + @Properties + N')
    ) p'
    exec sp_executesql @Query
END
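To make the pivot step concrete, here is a rough in-memory sketch of the same reshaping in F#. The sample rows are assumptions modeled on this chapter's test data, and the pivot function is a hypothetical illustration, not something the stored procedure or the type provider gives you:

```fsharp
// One row per property value, shaped like the result of the inner SELECT:
// (pageID, property name, property value). Sample data is assumed.
let rows =
    [ 1, "PageName", "Home"
      1, "VisibleInMenu", "true"
      1, "MainBody", "Welcome to my homepage"
      2, "PageName", "About Me"
      2, "VisibleInMenu", "true"
      2, "MainBody", "I am a software developer" ]

// Pivot: group the rows by page and turn each property name into a "column"
let pivot (rows : (int * string * string) list) =
    rows
    |> Seq.groupBy (fun (pageID, _, _) -> pageID)
    |> Seq.map (fun (pageID, values) ->
        let columns =
            values
            |> Seq.map (fun (_, name, value) -> name, value)
            |> Map.ofSeq
        pageID, columns)
    |> List.ofSeq

let pages = pivot rows

for pageID, columns in pages do
    printfn "%d: %s" pageID columns.["PageName"]
```

Six narrow rows become two wide ones, one per page, which is the shape we want to map onto a record.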
This stored procedure pivots the property rows into columns, so instead of getting a data model that is hard to work with, you get a denormalized result table that maps better to a model object in the code, as shown in the following image:
Because this stored procedure executes a dynamic SQL query, the type provider will not be able to pick up the result schema automatically. We will need to implement a simple object-relational mapper (OR mapper) for this:
open System
open System.Data
open Microsoft.FSharp.Reflection
type ContentPage = { pageID : int; PageName : string; VisibleInMenu : bool; MainBody : string }
// map dataRecord to pageType, matching record fields to columns by name
let convert<'pageType> (dataRecord : IDataRecord) =
    let pageType = typeof<'pageType>
    let values =
        FSharpType.GetRecordFields(pageType)
        |> Array.map (fun field -> Convert.ChangeType(dataRecord.[field.Name], field.PropertyType))
    FSharpValue.MakeRecord(pageType, values) :?> 'pageType
// get all pages of type
let getPagesOfType<'pageType> (db : dbSchema.ServiceTypes.SimpleDataContextTypes.Chapter05) =
    seq {
        let command = db.Connection.CreateCommand()
        command.CommandText <- "GetPagesOfPageType"
        command.CommandType <- CommandType.StoredProcedure
        // the name of the record type doubles as the page type name
        let pageTypeParameter = new System.Data.SqlClient.SqlParameter("PageType", typeof<'pageType>.Name)
        command.Parameters.Add(pageTypeParameter) |> ignore
        db.Connection.Open()
        try
            use reader = command.ExecuteReader()
            while reader.Read() do
                yield (reader :> IDataRecord) |> convert<'pageType>
        finally
            db.Connection.Close()
    }
Here I defined a record type that acts as my page type, and a function called getPagesOfType that retrieves the data by calling the stored procedure and maps each row to that record type.
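The reflection part of the mapping can be tried out without a database. The following sketch assumes a dictionary as a stand-in for the IDataRecord, and convertRow is a hypothetical variation of convert written only for this illustration:

```fsharp
open System
open System.Collections.Generic
open Microsoft.FSharp.Reflection

type ContentPage = { pageID : int; PageName : string; VisibleInMenu : bool; MainBody : string }

// Same idea as convert, but reading from a dictionary instead of an IDataRecord.
// Each record field is looked up by name and converted to the field's type.
let convertRow<'record> (row : IDictionary<string, obj>) =
    let recordType = typeof<'record>
    let values =
        FSharpType.GetRecordFields recordType
        |> Array.map (fun field -> Convert.ChangeType(row.[field.Name], field.PropertyType))
    FSharpValue.MakeRecord(recordType, values) :?> 'record

// A row as the pivoted result would deliver it: property values are strings
let row =
    dict [ "pageID", box 1
           "PageName", box "Home"
           "VisibleInMenu", box "true"
           "MainBody", box "Welcome to my homepage" ]

let page = convertRow<ContentPage> row

printfn "%A" page
```

Note that the string "true" ends up as a Boolean in the record. That conversion is done by Convert.ChangeType, and it is the reason we can store every property value as text in the PropertyValue table.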
In order to verify that it works, we need to append some data to our setup function:
// create page function
let createPage created author pageType =
    new dbSchema.ServiceTypes.Page(Created = created, Author = author, PageType = pageType)
// create property on page
type dbSchema.ServiceTypes.Page with
    member this.addPropertyValue propertyDefinition value =
        this.PropertyValue.Add(new dbSchema.ServiceTypes.PropertyValue(Value = value, Page = this, Property = propertyDefinition)) |> ignore
        this
// create pages
let startPage =
    (((createPage DateTime.Now "Mikael Lundin" contentPage)
        .addPropertyValue pageNameProperty "Home")
        .addPropertyValue visibleInMenuProperty "true")
        .addPropertyValue mainBodyProperty "Welcome to my homepage"
let aboutPage =
    (((createPage DateTime.Now "Mikael Lundin" contentPage)
        .addPropertyValue pageNameProperty "About Me")
        .addPropertyValue visibleInMenuProperty "true")
        .addPropertyValue mainBodyProperty "I am a software developer"
let servicesPage =
    (((createPage DateTime.Now "Mikael Lundin" contentPage)
        .addPropertyValue pageNameProperty "My Services")
        .addPropertyValue visibleInMenuProperty "true")
        .addPropertyValue mainBodyProperty "I build high-quality software in F#"
// insert
db.Page.InsertAllOnSubmit [startPage; aboutPage; servicesPage]
db.DataContext.SubmitChanges()
Here we used the new definitions and added our pages and properties on top of them. Now we're actually ready to see whether this works. Here is our test function:
[<Test>]
let ``get all pages of a page type`` () =
    // arrange
    let db = dbSchema.GetDataContext()
    // act
    let page1 :: page2 :: page3 :: [] = getPagesOfType<ContentPage>(db) |> Seq.toList
    // assert
    page1.PageName |> should equal "Home"
    page1.VisibleInMenu |> should equal true
    page1.MainBody |> should equal "Welcome to my homepage"
    page2.PageName |> should equal "About Me"
    page2.VisibleInMenu |> should equal true
    page2.MainBody |> should equal "I am a software developer"
    page3.PageName |> should equal "My Services"
    page3.VisibleInMenu |> should equal true
    page3.MainBody |> should equal "I build high-quality software in F#"
I just love how easy this reads; it hardly needs any explaining at all. In the end, we managed to use F# to test stored procedures, both with the built-in SQL type provider and by calling the stored procedure and mapping it up manually. It is very easy, and there is really no excuse for not testing your database properly.
Data-driven testing
One important aspect of integration testing is being able to test with a diversity of data. This can lead to a lot of repetition. For example, we might want to run the exact same test but with other input values. Instead of writing a unique test for each and every test case, you can write a general test and supply the data for the test separately.
In order to demonstrate this, go back to our address register and query it for addresses. This is what our SUT will look like:
type dbSchema = SqlDataConnection<"Data Source=.;Initial Catalog=Chapter05;Integrated Security=SSPI;">
// find an address by a search string
let searchAddress q (db : dbSchema.ServiceTypes.SimpleDataContextTypes.Chapter05) =
    query {
        for address in db.Address do
        where ((address.StreetName.StartsWith q) ||
               (address.StreetNumber = q) ||
               (address.PostalCode.StartsWith q))
        select address
    }
This function will allow you to search for an address on three different parameters: street name, street number, and postal code. Here, street number must be an exact match, but it's fine for the street name and postal code to match partially.
This requires some extensive testing. Instead of writing all these tests, we can provide one test function with input arguments.
In this example, we supply two arguments to the test function. First is the query that will run and the second is a Boolean argument that specifies whether we can expect the query to find our test address or not:
[<TestCase("Lant", true)>] // match start of street name
[<TestCase("gatan", false)>] // not matching end of street name
[<TestCase("L", true)>] // matching single letter
[<TestCase("38", true)>] // matching whole street number
[<TestCase("3", false)>] // not matching part of street number
[<TestCase("12559", true)>] // matching whole postal code
[<TestCase("125", true)>] // matching start of postal code
[<TestCase("59", false)>] // not matching end of postal code
let ``should query address register`` q expectedFind =
    // setup
    let db = dbSchema.GetDataContext()
    db.Connection.Open()
    let transaction = db.Connection.BeginTransaction(isolationLevel = IsolationLevel.Serializable)
    db.DataContext.Transaction <- transaction
    db.SetupTestData() |> ignore // <-- here the db is prepped with test data
    try
        // act
        let addresses = searchAddress q db |> Seq.toList
        // assert
        let found =
            addresses
            |> Seq.exists (fun address ->
                (address.StreetName = "Lantgatan") &&
                (address.StreetNumber = "38") &&
                (address.PostalCode = "12559"))
        found |> should equal expectedFind
    finally
        // teardown
        transaction.Rollback()
        db.Connection.Close()
This kind of testing is very suitable for certain kinds of problems where you have the same function that you want to try with a range of data.
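The same idea works without any attributes: the test cases are just data. As a minimal sketch, here is the table of cases from above run against an in-memory list. The Address record and this list-based searchAddress are assumptions made for the illustration; the real types are generated by the SQL type provider:

```fsharp
// In-memory stand-in for a row of the Address table (assumed shape)
type Address = { StreetName : string; StreetNumber : string; PostalCode : string }

let testAddress = { StreetName = "Lantgatan"; StreetNumber = "38"; PostalCode = "12559" }

// Same matching rules as the database query, applied to a list instead of a table
let searchAddress (q : string) (addresses : Address list) =
    addresses
    |> List.filter (fun address ->
        address.StreetName.StartsWith(q)
        || address.StreetNumber = q
        || address.PostalCode.StartsWith(q))

// Each case is a query and whether we expect it to find the test address
let cases =
    [ "Lant", true       // match start of street name
      "gatan", false     // not matching end of street name
      "L", true          // matching single letter
      "38", true         // matching whole street number
      "3", false         // not matching part of street number
      "12559", true      // matching whole postal code
      "125", true        // matching start of postal code
      "59", false ]      // not matching end of postal code

// Collect every case where the outcome differs from the expectation
let failures =
    cases
    |> List.filter (fun (q, expectedFind) ->
        let found = not (List.isEmpty (searchAddress q [ testAddress ]))
        found <> expectedFind)

printfn "%d of %d cases failed" failures.Length cases.Length
```

Keeping the cases as a plain list makes it easy to add a new one, and the TestCase attribute gives you the same benefit with the added bonus that every case is reported as a separate test.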
Testing web services
In this chapter, I have been talking exclusively about databases because the database is the part of a system where you will most commonly want to perform integration tests. The database is often part of your system design and a unit in the solution architecture. Web services, on the other hand, are just as often internal to our systems as they are hosted externally by a third party.
This leads us to the question: should we test external web services?
Most of the time, you should not care about testing external services. If there is a well-defined API that feels stable, it is not your responsibility to cover it with tests.
However, if the external API is developed as a part of your solution, there could be a huge benefit for you to write a set of integration tests that will define the contract of what your application is expecting from the API and hand it over to a third party. They can use these tests to verify that your client will work together with their implementation. This way of integrating between teams is extremely powerful.
Another reason for writing tests that hit an external service is to discover the service, figure out how it works, and note the quirks it has before implementing your client. These tests should be inactivated once the client is implemented, as the discovery phase is over and they do not bring value anymore.
You should be really careful when writing integration tests that will hit external services. Always strive for running your tests against a test web service at the receiving end so you don't cause side effects on the production system.
It is also quite common for external web services to charge for every API call you make. This can make automation expensive, and I would suggest that you really get to the bottom of the pricing before you start writing automated tests for external services.
I recommend that you follow the advice of vertical slice testing and implement the external service as an in-memory representation so you can choose not to care about external dependency when you want to test other things. These are the kinds of test harnesses that really pay off once they are in place.
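What such an in-memory representation could look like is sketched below. All the names here (EmailValidator, inMemoryValidator, and canRegister) are hypothetical and only serve the illustration. The point is that the external service hides behind a function type, so a test can pass in a canned implementation instead of a client that goes over the network:

```fsharp
type ValidationStatus =
    | Valid
    | Invalid
    | Error of message : string

// The dependency is just a function, so tests can swap the real client
// for an in-memory one
type EmailValidator = string -> ValidationStatus

// In-memory representation: answers are canned, no network involved.
// Anything not listed is treated as invalid.
let inMemoryValidator (known : Map<string, ValidationStatus>) : EmailValidator =
    fun input ->
        match Map.tryFind input known with
        | Some status -> status
        | None -> Invalid

// Hypothetical code under test that depends on the validator
let canRegister (validate : EmailValidator) (email : string) =
    match validate email with
    | Valid -> true
    | Invalid | Error _ -> false

let validator =
    inMemoryValidator (Map.ofList [ "mikael.lundin@valtech.se", Valid
                                    "broken@service.se", Error "timeout" ])

printfn "%b" (canRegister validator "mikael.lundin@valtech.se")
```

The canned Error case is where this pays off: provoking a timeout from a real third-party service on demand is hard, but here it is one line of test data.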
Web service type provider
When testing SOAP web services from C#, you need to generate code by creating a client through Visual Studio. In F#, none of this is necessary.
Let's say that we create the following web service:
[ServiceContract]
public interface IValidationService
{
    [OperationContract]
    ValidationResult ValidateEmail(string input);
}
[DataContract]
public class ValidationResult
{
    public ValidationResult()
    {
    }
    public ValidationResult(string identifier)
    {
        Identifier = identifier;
    }
    [DataMember]
    public string Identifier { get; set; }
    [DataMember]
    public ValidationStatus Status { get; set; }
    [DataMember]
    public string ErrorMessage { get; set; }
}
[DataContract]
public enum ValidationStatus
{
    [EnumMember]
    None = 0,
    [EnumMember]
    Valid = 1,
    [EnumMember]
    Invalid = 2,
    [EnumMember]
    Error = 3,
}
The service is implemented as follows:
public class ValidationService : IValidationService
{
    const string EmailValidation = "EmailValidation";
    const string EmailExpression = @"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$";
    public ValidationResult ValidateEmail(string input)
    {
        var result = new ValidationResult(EmailValidation);
        try
        {
            var expression = new Regex(EmailExpression, RegexOptions.IgnoreCase);
            if (expression.IsMatch(input))
            {
                result.Status = ValidationStatus.Valid;
            }
            else
            {
                result.Status = ValidationStatus.Invalid;
            }
        }
        catch (Exception ex)
        {
            result.Status = ValidationStatus.Error;
            result.ErrorMessage = ex.Message;
        }
        return result;
    }
}
When we start the service, it will expose a Web Services Description Language (WSDL) contract at http://localhost:53076/ValidationService.svc?wsdl. The port number might vary from machine to machine. This contract is also what the type provider for F# will use to generate its types:
type ValidationService = WsdlService<"http://localhost:53076/ValidationService.svc?wsdl">
// mapping the generated classes into a discriminated union
type ValidationStatus =
    | Valid
    | Invalid
    | Error of message : string
    static member Create (input : ValidationService.ServiceTypes.chapter05.code.service.ValidationResult) =
        match input.Status with
        | ValidationService.ServiceTypes.chapter05.code.service.ValidationStatus.Valid -> Valid
        | ValidationService.ServiceTypes.chapter05.code.service.ValidationStatus.Invalid -> Invalid
        | ValidationService.ServiceTypes.chapter05.code.service.ValidationStatus.Error -> Error(input.ErrorMessage)
        | _ -> failwith "Unknown validation status"
// wrapper that provides a functional interface
let validateEmail (service : ValidationService.ServiceTypes.SimpleDataContextTypes.ValidationServiceClient) input =
    let result = service.ValidateEmail(input)
    match result |> ValidationStatus.Create with
    | Valid -> true
    | Invalid -> false
    | Error message -> failwith ("Validating e-mail caused the following error: " + message)
Let's write tests that will verify that our validation service works:
[<TestCase("hello@mikaellundin.name")>]
[<TestCase("mikael.lundin@litemedia.se")>]
[<TestCase("mikael.lundin@valtech.se")>]
[<TestCase("mikael.lundin@litemedia.info")>]
let ``should validate as e-mail address`` (input) =
    // arrange
    let service = ValidationService.GetBasicHttpBinding_IValidationService()
    // act / assert
    (validateEmail service input) |> should be True
[<TestCase("not an e-mail")>]
[<TestCase("not@email")>]
[<TestCase("not@an@email")>]
[<TestCase("@notanemail")>]
[<TestCase("notanemail@")>]
[<TestCase("@notanemail@")>]
let ``should not match as an e-mail address`` (input) =
    // arrange
    let service = ValidationService.GetBasicHttpBinding_IValidationService()
    // act / assert
    (validateEmail service input) |> should be False
[<Test>]
let ``should fail validation when e-mail is null`` () =
    // arrange
    let service = ValidationService.GetBasicHttpBinding_IValidationService()
    // act / assert
    (fun () -> (validateEmail service null) |> ignore) |> should throw typeof<System.Exception>
The question that comes to mind is whether these web service tests are worth the effort, and whether we could have achieved the same by testing the ValidationService class directly with a unit test.
What we are testing here is how the client meets the service and how the network and transport parts of this service work together. We have intricate things such as authentication, authorization, and transport security to deal with, all of which you will be happy to have tested before deploying an SSL WCF service to production.
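For comparison, this is roughly how much a unit test of the matching rule alone would cover. The sketch assumes that the expression from the C# service is ported to F# as is, and isEmail is a hypothetical helper written for this illustration:

```fsharp
open System.Text.RegularExpressions

// Same expression as in the C# ValidationService, ported here as an assumption
let emailExpression =
    Regex(@"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$", RegexOptions.IgnoreCase)

// Pure function: no client, no service, no network
let isEmail (input : string) = emailExpression.IsMatch(input)

let accepted = [ "hello@mikaellundin.name"; "mikael.lundin@litemedia.se" ]
let rejected = [ "not an e-mail"; "not@email"; "not@an@email"; "@notanemail"; "notanemail@" ]

printfn "accepted ok: %b" (accepted |> List.forall isEmail)
printfn "rejected ok: %b" (rejected |> List.forall (isEmail >> not))
```

This verifies the regular expression and nothing else. The serialization of ValidationResult, the binding configuration, and the mapping of a service error to an exception in the client are only exercised by the tests that go through the web service.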
Summary
In this chapter, we learned about how to write integration tests and what constitutes a good integration test. We have been testing databases and dealing with their setups and teardowns. We looked at how to write tests for a web service in order to verify the complete system and its integration.
In the next chapter, we will take a bird's-eye perspective and do black box testing, treating the system as if we didn't know about its internals. We will do this by learning some tools that will help us drive web browsers, query HTML, and express tests in a format that even your manager will appreciate.