# Better Business Decisions from Data: Statistical Analysis for Professional Success (2014)

### Part VI. Forecasts

*Prediction is very difficult, especially about the future.*

—Niels Bohr

So far we have been examining the use of statistics in describing present or past situations. Usually, of course, we gain such understanding in order to make decisions for the future—in other words, we are interested in forecasting. In this part, we shall see the ways in which statistics can help in forecasting.

### Chapter 17. Extrapolation

**Malthus Got It Wrong**

Thomas Malthus, an English cleric, economist, and statistician, is known for his theories on population growth. He wrote in 1798 in his *Essay on the Principle of Population* that, because population increases geometrically (1, 2, 4, 8, …) and food increases arithmetically (1, 2, 3, 4, …), the population would eventually outstrip food supply. He warned of premature death visiting the human race. The onset of disaster would be prevented only by epidemics, pestilence, plague, famine, and preventive measures. His numerous writings on the subject gave rise to the *Malthusian doctrine*.
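The arithmetic behind the argument can be sketched in a few lines. The function name and the unit step sizes below are illustrative assumptions that simply reproduce the two sequences quoted above; they are not figures Malthus gave for any real population.

```python
def malthus_sequences(periods):
    """Return (population, food) index lists for the given number of periods.

    Population doubles each period (geometric: 1, 2, 4, 8, ...);
    food grows by one unit each period (arithmetic: 1, 2, 3, 4, ...).
    """
    population = [2 ** t for t in range(periods)]
    food = [1 + t for t in range(periods)]
    return population, food


population, food = malthus_sequences(8)
for t, (p, f) in enumerate(zip(population, food)):
    # Show how far the population index has pulled ahead of the food index
    print(f"period {t}: population={p:4d} food={f:2d} ratio={p / f:.2f}")
```

By the eighth period the population index stands at 128 against a food index of 8. The calculation is only as good as the assumption that both growth patterns persist unchanged.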

This doctrine is based on an extrapolation, and because it was proposed so long ago, we can now see that it was unjustified. It remains relevant today as a striking example of the dangers of extrapolation.

No one knows what tomorrow will bring. No one can predict the future with certainty. Of course, some events we can be fairly sure of: no one doubts that the Sun will rise tomorrow, but this is not the kind of event that statistics is asked to give a judgment on. Statistics is based on observations and measurements that relate to the past, but the purpose of statistics, apart from providing interesting historical facts, is to attempt to predict the future. Every shopkeeper who orders goods from his suppliers is indulging in forecasting. How can he be sure how many customers he will have tomorrow?

In the Mega Millions lottery, each number has an equal chance of being drawn. Although a few individuals may have doubts, most people would accept that this is true. In spite of this underlying knowledge, many people think a degree of forecasting is possible. They argue that each number will eventually appear the same number of times. In fact, the probability of each number being drawn exactly the same number of times is small, although the number of appearances of each number is likely to be approximately the same. If number 23, say, is lagging behind in number of appearances, some forecasters conclude that 23 now has a greater chance of being drawn. Others, perhaps, with a more cynical view, argue that there must be a reason why 23 is appearing less frequently and conclude that the trend is likely to continue. I happen to have a penny that I have tossed five times, and it has given five heads. Would anyone like to purchase it? It could win a fortune on its next toss! Our two groups of forecasters would, however, disagree about whether the next toss would result in a head or a tail.
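The difference between "exactly the same number of times" and "approximately the same" can be checked with a short calculation. The sketch below uses a fair coin as a simplified stand-in for lottery numbers, and the function names and the 5-percentage-point band are assumptions chosen purely for illustration.

```python
from math import comb


def prob_exactly_equal(tosses):
    """Probability that a fair coin gives exactly as many heads as tails."""
    if tosses % 2:
        return 0.0  # an odd number of tosses can never split evenly
    return comb(tosses, tosses // 2) / 2 ** tosses


def prob_within(tosses, percent):
    """Probability that the share of heads lies within `percent`
    percentage points of 50% (integer arithmetic avoids rounding issues)."""
    favourable = 0
    for heads in range(tosses + 1):
        if abs(2 * heads - tosses) * 100 <= 2 * percent * tosses:
            favourable += comb(tosses, heads)
    return favourable / 2 ** tosses


for n in (10, 100, 1000):
    print(n, round(prob_exactly_equal(n), 4), round(prob_within(n, 5), 4))
```

With 10 tosses an exact 5-5 split has probability 252/1024, roughly 25%; with 100 tosses an exact 50-50 split falls to about 8%. Meanwhile the chance of landing close to an even split rises as the number of tosses grows. Neither result says anything about the next toss, which for a fair coin remains an even chance regardless of what came before.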

Statistics alone cannot provide reliable forecasting. Common sense and judgment are needed, but both of these involve a degree of subjectivity. Objectivity is what is ideally required, and statistics can contribute in providing objective analyses. Forecasts based solely on subjective judgments can be useless or even disastrous. The gambling industry thrives on the fact that people are generally not very good at making forecasts. Suppose a successful jockey has not had a win in his last four races. Some would therefore be tempted to predict a win in his next race. Others would offer the alternative argument that his poor present performance is likely to continue. This situation has parallels in the business world. If the number of customers was unusually low today, does that allow us to say that tomorrow will bring extra customers to keep up the average, or does it allow us to argue that the trend will continue and produce fewer customers?

Forecasting is an essential activity, and in spite of the difficulties and pitfalls, we have to accept that it is always going to be with us. Forecasting the future can be based only on knowledge of the past and present. In order to use existing data to predict what the corresponding data will be in the future, we have to employ extrapolation to some degree. This creates a serious problem at the outset. We can never be certain that the same circumstances will exist in the future, and we can therefore never be certain that our forecasts will be reliable. From a strictly mathematical point of view, we should never extrapolate beyond the limits within which the data was obtained. Thus, if we observe that the population of our town has grown by an average of 1,000 per year over the past 10 years, we would be unjustified in deducing that the next 10 years will bring a further increase of 10,000. We might, of course, consider it reasonable to bend the rule and assume that next year will bring an increase of about 1,000, the degree of extrapolation being relatively small.
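One way to see why a short extrapolation is less risky than a long one is to fit a straight line and look at how the uncertainty of a prediction grows with distance from the data. The sketch below uses made-up population figures for a hypothetical town and the textbook prediction-error formula for simple linear regression; the function names are illustrative assumptions.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def prediction_spread(xs, x_new):
    """Factor multiplying the residual standard error to give the standard
    error of a single new prediction at x_new."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)


# Hypothetical town: invented figures growing by roughly 1,000 a year.
years = list(range(1, 11))
population = [50_900, 52_100, 52_800, 54_200, 55_000,
              55_900, 57_100, 57_900, 59_200, 60_000]

slope, intercept = fit_line(years, population)
for year in (11, 20):
    forecast = intercept + slope * year
    print(year, round(forecast), round(prediction_spread(years, year), 2))
```

For ten yearly observations the spread factor is about 1.21 one year ahead and about 1.91 ten years ahead. This captures only the statistical uncertainty under the assumption that the straight-line trend continues; it cannot account for a change in circumstances, which is the more serious risk.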

We can distinguish different degrees of extrapolation to allow judgments as to the reasonableness of the extrapolations we encounter. Starting with a trivial situation, if there is perfect correlation between two variables, we expect no problems with extrapolation. If we know the volume occupied by a kilogram of sugar, we can reliably forecast the space required to store 1,000 kilograms of sugar. If we know that £1 can be exchanged for $2, we can predict with certainty how many dollars we will get for £100.

When a well-established scientific law relates a number of variables, it is possible to make reliable forecasts. The speed of a satellite circling the Earth is related to its height above the Earth, for example. If it were not for the ability to predict from such relationships, technology could not advance in the way it does. Of course, even well-used relationships have practical limitations. A spring extends in proportion to the weight applied to it, but if it is over-stretched the relationship changes.
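As an illustration of forecasting from a physical law, the satellite example can be worked through numerically. The text does not state the relationship, so the sketch below assumes the standard result for a circular orbit, speed = sqrt(GM / r), with r measured from the Earth's centre; the constant and function names are illustrative.

```python
from math import sqrt

MU_EARTH = 3.986004418e14   # Earth's gravitational parameter GM, in m^3/s^2
EARTH_RADIUS = 6.371e6      # mean radius of the Earth, in metres


def circular_orbit_speed(height_m):
    """Speed (m/s) of a satellite in a circular orbit at the given height
    above the Earth's surface (assumes a perfectly circular orbit)."""
    return sqrt(MU_EARTH / (EARTH_RADIUS + height_m))


for height_km in (400, 20_200, 35_786):
    speed = circular_orbit_speed(height_km * 1000)
    print(f"{height_km:>6} km: {speed / 1000:.2f} km/s")
```

The higher the orbit, the slower the satellite: roughly 7.7 km/s at 400 km and about 3.1 km/s at geostationary height. Because the relationship rests on well-tested physics, such a forecast can be trusted far outside the range of any single set of observations, within the limits of the circular-orbit assumption.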

Many laws, rather than being based on basic physical principles, are empirical and may have complex and changeable causes. The law of supply and demand, for example, can be justified experimentally and theoretically but may not always apply. Special circumstances can arise that upset expectations.