# Better Business Decisions from Data: Statistical Analysis for Professional Success (2014)

### Part IV. Comparisons

### Chapter 9. General Procedure for Comparisons

**Eight Easy Steps from Null to Significance**

After you decide what is to be compared with what, you should clearly define the null hypothesis. It is very easy to later become confused between the null hypothesis and the alternative hypothesis.

The next step is to choose the acceptable level of statistical significance. It is important to fix and declare this significance level at the outset so that your choice will not be influenced by the result you obtain.

Next choose the statistical test you will use. The following three chapters will describe the appropriateness of various tests in relation to the available data and the conclusions sought. Each statistical test employs published tables from which the levels of significance can be obtained. The tables are produced from calculations that are often complex. In practice, the availability of computer programs has removed much of the need to refer to tables; the complete sequence of calculation, from the raw data to the statement of the significance level, is hidden from view. Nevertheless, it is wise to appreciate the steps that are followed within the procedure.

The number of available tests is very large, and new ones are being developed. It would be impossible to include them all, so I will describe the well-established tests that are in common use.

Statistical tests vary in their power, the *power* of a test being the probability that it will detect a real difference when one exists, that is, that it will correctly reject a null hypothesis that is false. Other things being equal, the test should be chosen to maximize the power. Tests that make no assumptions about the distribution that the data fits are generally less powerful than those that assume a particular distribution, provided that the assumed distribution really does describe the data.
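To make the idea of power concrete, here is a minimal sketch, taking power in its usual sense as the probability of rejecting the null hypothesis when a real difference exists. It assumes a one-tailed z-test with a known population standard deviation; the function name `z_test_power` and the values for `effect`, `sigma`, and `n` are illustrative choices of mine, not figures from the text.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a one-tailed z-test (upper tail), assuming sigma is known.

    effect: true difference between the population mean and the
            value stated in the null hypothesis (hypothetical input).
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    # Threshold the test statistic must exceed to reject the null hypothesis.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the statistic is centred here instead of at 0.
    shift = effect * sqrt(n) / sigma
    # Probability that the statistic lands beyond the threshold.
    return 1 - std_normal.cdf(z_crit - shift)


# Power grows as the sample size grows, for the same true difference.
for n in (10, 30, 100):
    print(n, round(z_test_power(effect=0.5, sigma=1.0, n=n), 3))
```

When `effect` is zero the function returns `alpha` itself, which is a useful check: with no real difference, the only rejections are the false alarms allowed by the significance level.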

Every collection of data is unique; it would clearly be impossible to provide a table of values for each situation. The data is therefore processed to produce a standard value of what is termed a *test statistic*. In effect, the data is scaled to allow a direct comparison with the standard distribution. The idea of scaling the data in order to compare it with a standard distribution was introduced in the section on standard normal distribution in Chapter 7.
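As an illustration of scaling raw data into a test statistic, the sketch below computes a one-sample t statistic, one common example of such a statistic. The function name `one_sample_t`, the hypothesized mean `mu0`, and the data values are invented for the example.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Scale a sample into a t statistic for the null hypothesis mean == mu0.

    Returns the statistic and its degrees of freedom.
    """
    n = len(sample)
    # Standard error: the sample standard deviation scaled by sqrt(n).
    standard_error = stdev(sample) / sqrt(n)
    # Distance of the sample mean from mu0, in units of standard error.
    t = (mean(sample) - mu0) / standard_error
    return t, n - 1


# Made-up measurements, for illustration only.
weights = [498.2, 501.1, 499.5, 497.8, 500.3, 498.9]
t, df = one_sample_t(weights, mu0=500.0)
print(f"t = {t:.3f} on {df} degrees of freedom")
```

Whatever the units or size of the original measurements, the result is a single dimensionless number that can be compared with one standard table.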

The test statistic is referred to the appropriate table together with the number of degrees of freedom or, for some tests, the number of data values in the sample or in each of the samples used to calculate the statistic. In some situations you need to distinguish between a one-tailed test and a two-tailed test in referring to the table.
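The difference between one-tailed and two-tailed values can be sketched with the standard normal distribution, which needs no degrees of freedom (for a distribution such as t, the degrees of freedom would additionally select the row of the table). The helper names `critical_z` and `is_significant` are hypothetical, and the one-tailed case assumes the alternative hypothesis points to the upper tail.

```python
from statistics import NormalDist


def critical_z(alpha, two_tailed):
    """Critical value a z statistic must exceed at significance level alpha."""
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


def is_significant(z, alpha, two_tailed):
    """Compare a z statistic with the critical value, as one would with a table."""
    if two_tailed:
        return abs(z) > critical_z(alpha, True)
    # One-tailed: assumes the alternative hypothesis is "greater than".
    return z > critical_z(alpha, False)


z = 1.8  # an illustrative test statistic
print("one-tailed critical value:", round(critical_z(0.05, False), 3))
print("two-tailed critical value:", round(critical_z(0.05, True), 3))
print("significant, one-tailed:", is_significant(z, 0.05, False))
print("significant, two-tailed:", is_significant(z, 0.05, True))
```

The same statistic can be significant on a one-tailed reading and not on a two-tailed one, which is why the choice has to be made before looking at the result.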

In summary, the procedure for comparing samples of data and their statistical properties is as follows:

1. Decide on the comparison to be made.

2. State the null hypothesis.

3. Decide on the required level of significance.

4. Choose the statistical test.

5. Calculate the test statistic and the degrees of freedom.

6. Note, if necessary, whether to use one-tailed or two-tailed values.

7. Refer to the tables.

8. Read off the level of significance.
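The eight steps above can be sketched end to end in code. The example below assumes a comparison of two sample means using a large-sample z-test, with the normal distribution standing in for the printed table; that choice of test, the function name `compare_means`, and the simulated data are my own illustrative assumptions rather than a procedure prescribed in the text.

```python
from math import sqrt
from random import Random
from statistics import NormalDist, mean, stdev


def compare_means(sample_a, sample_b, alpha=0.05, two_tailed=True):
    """Walk through the eight steps for two sample means (large-sample z-test)."""
    # Step 1: the comparison is between the means of the two samples.
    # Step 2: null hypothesis: the two population means are equal.
    # Step 3: alpha is fixed by the caller before any data is examined.
    # Step 4: test chosen here: large-sample z-test (an assumption).
    # Step 5: test statistic. The normal approximation has no degrees of
    #         freedom; a t-test would compute them at this point.
    se = sqrt(stdev(sample_a) ** 2 / len(sample_a)
              + stdev(sample_b) ** 2 / len(sample_b))
    z = (mean(sample_a) - mean(sample_b)) / se
    # Steps 6-8: pick the tail(s) and "read off" the significance level.
    std_normal = NormalDist()
    if two_tailed:
        p = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        # One-tailed: assumes the alternative is mean_a > mean_b.
        p = 1 - std_normal.cdf(z)
    return z, p, p < alpha


# Simulated data with a fixed seed, for illustration only.
rng = Random(0)
group_a = [rng.gauss(52.0, 5.0) for _ in range(40)]
group_b = [rng.gauss(50.0, 5.0) for _ in range(40)]
z, p, reject = compare_means(group_a, group_b)
print(f"z = {z:.3f}, p = {p:.4f}, reject null hypothesis: {reject}")
```

Note that `alpha` and `two_tailed` are arguments fixed before the data is processed, mirroring the advice to declare the significance level at the outset.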

So far we have dealt with descriptive data before numerical data, progressing from the simpler to the more complex. Now, however, we will consider numerical data first. This is because the procedures for comparing numerical data are usually better known. Furthermore, some descriptive data can be recast in numerical form and dealt with in ways that I will already have described by then.