When Do You Use a Binomial Distribution?

Conditions for Using This Probability Distribution

A histogram of a binomial distribution. C.K.Taylor

Binomial probability distributions are useful in a number of settings. It is important to know when this type of distribution should be used. We will examine all of the conditions that are necessary in order to use a binomial distribution.

The basic setting is this: a total of n independent trials are conducted, and we want to find the probability of r successes, where each success has probability p of occurring.
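The probability described above is given by the binomial probability formula, P(r) = C(n, r) p^r (1 - p)^(n - r), where C(n, r) counts the ways to choose which r of the n trials are successes. The following is a minimal Python sketch of that formula. The function name binomial_pmf is our own choice for illustration, and the coin example is an assumed set of values.

```python
from math import comb


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r successes in n independent trials,
    each with the same success probability p."""
    if not 0 <= r <= n:
        return 0.0  # impossible counts have probability zero
    # comb(n, r) counts which r of the n trials are the successes.
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# Illustrative values: exactly 3 heads in 10 flips of a fair coin.
prob_three_heads = binomial_pmf(3, 10, 0.5)
print(prob_three_heads)
```

Summing binomial_pmf(r, n, p) over r = 0, 1, ..., n should give 1, which is a quick sanity check on any choice of n and p.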

There are several things stated and implied in this brief description. The definition boils down to these four conditions:

  1. Fixed number of trials
  2. Independent trials
  3. Two different classifications
  4. Probability of success stays the same for all trials

All of these must be present in the process under investigation in order to use the binomial probability formula or tables. A brief description of each of these follows.

Fixed Trials

The process being investigated must have a clearly defined number of trials that does not vary. We cannot alter this number midway through our analysis. Each trial must be performed the same way as all of the others, although the outcomes may vary. The number of trials is indicated by an n in the formula.

An example of a process with fixed trials would be studying the outcomes from rolling a die ten times. Here each roll of the die is a trial. The total number of trials is defined from the outset.

Independent Trials

Each of the trials has to be independent. Each trial should have absolutely no effect on any of the others. The classical examples of rolling two dice or flipping several coins illustrate independent events. Since the events are independent we are able to use the multiplication rule to multiply the probabilities together.
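To see the multiplication rule at work, the short Python sketch below computes the probability of one particular sequence of coin flips by multiplying the probability of each individual outcome. The function name and the five-flip sequence are hypothetical, chosen only for illustration.

```python
from math import prod


def sequence_probability(outcomes: list[bool], p: float) -> float:
    """Probability of one specific sequence of independent trials.

    Each True is a success (probability p) and each False is a
    failure (probability 1 - p). Independence is what justifies
    multiplying the per-trial probabilities together.
    """
    return prod(p if outcome else 1 - p for outcome in outcomes)


# Hypothetical sequence for a fair coin: head, tail, head, head, tail.
flips = [True, False, True, True, False]
prob_sequence = sequence_probability(flips, 0.5)
print(prob_sequence)
```

Every sequence with the same number of successes has the same probability, which is why the binomial formula multiplies this product by the number of such sequences.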

In practice, especially due to some sampling techniques, there can be times when trials are not technically independent. A binomial distribution can sometimes be used in these situations as long as the population is large relative to the sample.

Two Classifications

Each of the trials is grouped under two classifications: successes and failures. Although we typically think of success as a positive thing, we should not read too much into this term. We are indicating that the trial is a success in that it lines up with what we have determined to call a success.

As an extreme case to illustrate this, suppose we are testing the failure rate of light bulbs. If we want to know how many in a batch will not work, we could define a success for our trial to be when we have a light bulb that fails to work. A failure for the trial is when the light bulb works. This may sound a bit backward, but there may be some good reasons for defining successes and failures of our trial as we have done. It may be preferable, for marketing purposes, to stress that there is a low probability of a light bulb not working rather than a high probability of a light bulb working.

Same Probabilities

The probabilities of successful trials must remain the same throughout the process we are studying.

Flipping coins is one example of this. No matter how many coins are tossed, the probability of flipping a head is 1/2 each time.

This is another place where theory and practice are slightly different. Sampling without replacement can cause the probabilities from each trial to differ slightly from one another. Suppose there are 20 beagles out of 1000 dogs. The probability of choosing a beagle at random is 20/1000 = 0.020. Now choose again from the remaining dogs. There are 19 beagles out of 999 dogs. The probability of selecting another beagle is 19/999 = 0.019. The value 0.02 is an appropriate estimate for both of these trials. As long as the population is large enough, this sort of estimation does not pose a problem with using the binomial distribution.
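The size of the approximation error in the beagle example can be checked numerically. The Python sketch below compares the exact probability of drawing two beagles in a row without replacement against the binomial value that treats both draws as having probability 20/1000. The variable names are our own, and the two-draw setup is simply the scenario described above.

```python
beagles, dogs = 20, 1000

# Exact: the second draw is made from the 999 remaining dogs,
# of which 19 are beagles.
exact_two_beagles = (beagles / dogs) * ((beagles - 1) / (dogs - 1))

# Binomial approximation: both draws use the same p = 20/1000,
# so P(2 successes in 2 trials) = p**2.
p = beagles / dogs
binomial_two_beagles = p**2

# The gap between the two values is small relative to either one.
difference = abs(exact_two_beagles - binomial_two_beagles)
print(exact_two_beagles, binomial_two_beagles, difference)
```

With a much smaller population, say 20 beagles out of 40 dogs, the same comparison would show a noticeably larger gap, which is why the size of the population relative to the sample matters.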