A paradox is a statement or phenomenon that on the surface seems contradictory. Paradoxes help to reveal the underlying truth beneath the surface of what appears to be absurd. In the field of statistics, Simpson's paradox demonstrates what kinds of problems result from combining data from several groups.

With all data, we need to exercise caution. Where did it come from? How was it obtained? And what is it really saying? These are all good questions that we should ask when presented with data. The very surprising case of Simpson's paradox shows us that sometimes what the data seem to be saying is not really the case.

## An Overview of the Paradox

Suppose we observe several groups and establish a relationship or correlation within each of them. Simpson's paradox says that when we combine all of the groups and look at the data in aggregate form, the correlation that we noticed before may reverse itself. This is most often due to lurking variables that have not been considered, but sometimes it is due to the numerical values of the data.

## Example

To make a little more sense of Simpson's paradox, let's look at the following example. In a certain hospital, there are two surgeons. Surgeon A operates on 100 patients, and 95 survive. Surgeon B operates on 80 patients, and 72 survive. We are considering having surgery performed in this hospital, and surviving the operation matters to us, so we want to choose the better of the two surgeons.

We look at the data and use it to calculate what percentage of surgeon A's patients survived their operations and compare it to the survival rate of the patients of surgeon B.

- 95 patients out of 100 survived with surgeon A, so 95/100 = 95% of them survived.
- 72 patients out of 80 survived with surgeon B, so 72/80 = 90% of them survived.
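The arithmetic in the two bullets above can be checked with a short script. This is only a minimal sketch: the dictionary layout and the `survival_rate` helper are illustrative names chosen here, and the counts are the ones stated in the example.

```python
# Aggregate counts from the example: surgeon -> (survivors, patients).
aggregate = {
    "A": (95, 100),
    "B": (72, 80),
}


def survival_rate(survived: int, total: int) -> float:
    """Return the survival rate as a percentage."""
    return 100 * survived / total


# Compute one aggregate rate per surgeon.
rates = {name: survival_rate(*counts) for name, counts in aggregate.items()}

for name, rate in rates.items():
    print(f"Surgeon {name}: {rate:.1f}% survived")

# The surgeon with the highest aggregate rate looks like the safer bet.
best = max(rates, key=rates.get)
print(f"Higher aggregate survival rate: surgeon {best}")
```

On these numbers alone, surgeon A comes out ahead, which matches the percentages listed above.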

From this analysis, which surgeon should we choose to treat us? It would seem that surgeon A is the safer bet. But is this really true?

What if we did some further research into the data and found that the hospital had originally tracked two different types of surgery, but then lumped all of the data together to report on each of its surgeons? Not all surgeries are equal. Some were considered high-risk emergency surgeries, while others were of a more routine nature and had been scheduled in advance.

Of the 100 patients whom surgeon A treated, 50 were high risk, of whom three died. The other 50 were considered routine, and of these two died. This means that a patient treated by surgeon A has a 47/50 = 94% survival rate for a high-risk surgery and a 48/50 = 96% survival rate for a routine surgery.

Now we look more carefully at the data for surgeon B and find that of 80 patients, 40 were high risk, of whom seven died. The other 40 were routine, and only one died. This means that a patient treated by surgeon B has a 33/40 = 82.5% survival rate for a high-risk surgery and a 39/40 = 97.5% survival rate for a routine surgery.
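To see the breakdown side by side, the counts for both surgeons can be tabulated by surgery type. The sketch below is illustrative: the nested dictionary and the `stratum_rate` helper are names made up for this example, and the deaths and patient counts are the ones given in the text.

```python
# Deaths and patient counts from the text.
# Layout: surgeon -> surgery type -> (deaths, patients).
outcomes = {
    "A": {"high risk": (3, 50), "routine": (2, 50)},
    "B": {"high risk": (7, 40), "routine": (1, 40)},
}


def stratum_rate(deaths: int, patients: int) -> float:
    """Survival rate, in percent, within one surgery type."""
    return 100 * (patients - deaths) / patients


# Survival rate for every surgeon and surgery type.
by_type = {
    surgeon: {kind: stratum_rate(*counts) for kind, counts in strata.items()}
    for surgeon, strata in outcomes.items()
}

for surgeon, strata in by_type.items():
    for kind, rate in strata.items():
        print(f"Surgeon {surgeon}, {kind}: {rate:.1f}% survived")
```

Splitting the data this way makes the per-type comparison explicit instead of leaving it hidden inside a single combined percentage.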

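Now which surgeon seems better? If your surgery is to be a routine one, then surgeon B is actually the better choice, with a 97.5% survival rate against 96% for surgeon A. For a high-risk surgery, surgeon A does better, with 47/50 = 94% against 33/40 = 82.5%, and A also comes out ahead when all surgeries are lumped together. The aggregate ranking therefore does not hold for every type of surgery, which is quite counterintuitive. In this case, the lurking variable of surgery type affects the combined data for the surgeons.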
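One way to see how the lurking variable acts is to write each surgeon's overall rate as a weighted average of the per-type rates, with weights equal to the share of patients in each type. The sketch below does this with the counts from the example. The `aggregate_rate` helper is an illustrative name, and the final case mix (10% high risk, 90% routine for surgeon B) is a purely hypothetical assumption, not data from the hospital.

```python
# Survivors and patients from the example.
# Layout: surgeon -> surgery type -> (survivors, patients).
strata = {
    "A": {"high risk": (47, 50), "routine": (48, 50)},
    "B": {"high risk": (33, 40), "routine": (39, 40)},
}


def aggregate_rate(rates: dict, weights: dict) -> float:
    """Weighted average of per-type survival rates (in percent)."""
    return sum(weights[kind] * rates[kind] for kind in rates)


overall = {}
type_rates = {}
for surgeon, groups in strata.items():
    total = sum(n for _, n in groups.values())
    # Per-type survival rates and the observed share of each type.
    type_rates[surgeon] = {k: 100 * s / n for k, (s, n) in groups.items()}
    mix = {k: n / total for k, (_, n) in groups.items()}
    overall[surgeon] = aggregate_rate(type_rates[surgeon], mix)
    print(f"Surgeon {surgeon} overall: {overall[surgeon]:.1f}%")

# HYPOTHETICAL case mix (an assumption for illustration only):
# suppose surgeon B had handled 10% high-risk and 90% routine cases.
hypothetical_mix = {"high risk": 0.1, "routine": 0.9}
b_hypothetical = aggregate_rate(type_rates["B"], hypothetical_mix)
print(f"Surgeon B with the hypothetical mix: {b_hypothetical:.1f}%")
```

Under the observed case mix, the weighted averages reproduce the overall rates from the example. Under the hypothetical mix, surgeon B's overall figure changes even though the per-type rates stay the same, which is why an aggregate comparison should be read alongside the mix of cases behind it.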

## History of Simpson's Paradox

Simpson's paradox is named after Edward Simpson, who described it in the 1951 paper "The Interpretation of Interaction in Contingency Tables" in the *Journal of the Royal Statistical Society*. Pearson and Yule each observed a similar paradox half a century earlier than Simpson, so Simpson's paradox is sometimes also referred to as the Yule-Simpson effect.

There are many wide-ranging applications of the paradox in areas as diverse as sports statistics and unemployment data. Any time that data is aggregated, watch out for this paradox to show up.