The negative binomial distribution is a probability distribution that is used with discrete random variables. This type of distribution concerns the number of trials that must occur in order to have a predetermined number of successes. As we will see, the negative binomial distribution is related to the binomial distribution. In addition, this distribution generalizes the geometric distribution.

### The Setting

We will start by looking at both the setting and the conditions that give rise to a negative binomial distribution. Many of these conditions are very similar to a binomial setting.

- We have a Bernoulli experiment. This means that each trial we perform has a well-defined success and failure, and that these are the only outcomes.
- The probability of success is constant no matter how many times we perform the experiment. We denote this constant probability by *p*.
- The trials are independent, meaning that the outcome of one trial has no effect on the outcome of a subsequent trial.

These three conditions are identical to those in a binomial setting. The difference is that a binomial random variable has a fixed number of trials *n*. The only possible values of *X* are 0, 1, 2, ..., *n*, so this is a finite distribution.

A negative binomial distribution is concerned with the number of trials *X* that must occur until we have *r* successes.

The number *r* is a whole number that we choose before we start performing our trials. The random variable *X* is still discrete. However, now the random variable can take on the values *X* = *r*, *r* + 1, *r* + 2, ... The set of possible values is countably infinite, as it could take an arbitrarily long time before we obtain *r* successes.

### Example

To help make sense of a negative binomial distribution, it is worthwhile to consider an example. Suppose that we flip a fair coin and we ask the question, "What is the probability that we get three heads in the first *X *coin flips?" This is a situation that calls for a negative binomial distribution.

The coin flips have two possible outcomes, the probability of success is a constant 1/2, and the trials are independent of one another. We ask for the probability of getting the third head on the *X*th coin flip. Thus we have to flip the coin at least three times. We then keep flipping until the third head appears.
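This coin-flip setting is easy to simulate. Below is a minimal sketch (the helper name `flips_until_r_heads` is just illustrative) that flips a fair coin until the third head appears and records how many flips that took:

```python
import random

def flips_until_r_heads(r, p, rng):
    """Count coin flips until the r-th head appears."""
    flips, heads = 0, 0
    while heads < r:
        flips += 1
        if rng.random() < p:  # heads counts as a success
            heads += 1
    return flips

# Repeat the experiment many times with a fair coin and r = 3.
rng = random.Random(0)
samples = [flips_until_r_heads(3, 0.5, rng) for _ in range(10_000)]
print(min(samples))                 # never below 3
print(sum(samples) / len(samples))  # close to r / p = 3 / 0.5 = 6
```

Notice that every simulated value is at least 3, matching the support *X* = *r*, *r* + 1, *r* + 2, ... described above.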

In order to calculate probabilities related to a negative binomial distribution, we need some more information. We need to know the probability mass function.

### Probability Mass Function

The probability mass function for a negative binomial distribution can be developed with a little bit of thought. Every trial has a probability of success given by *p*. Since there are only two possible outcomes, this means that the probability of failure is the constant 1 - *p*.

The *r*th success must occur on the *x*th and final trial. The previous *x* - 1 trials must contain exactly *r* - 1 successes.

The number of ways that this can occur is given by the number of combinations:

C(*x* - 1, *r* - 1) = (*x* - 1)!/[(*r* - 1)!(*x* - *r*)!].

In addition to this we have independent events, and so we can multiply our probabilities together. Putting all of this together, we obtain the probability mass function

*f*(*x*) = C(*x* - 1, *r* - 1) *p*^{r}(1 - *p*)^{x - r}.
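This mass function translates directly into code. The following is a sketch using only the standard library; the function name `neg_binom_pmf` is an assumption for illustration:

```python
from math import comb

def neg_binom_pmf(x, r, p):
    """P(X = x): the r-th success lands on trial x."""
    if x < r:
        return 0.0
    # C(x - 1, r - 1) placements of the first r - 1 successes,
    # times p^r for the successes and (1 - p)^(x - r) for the failures.
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

# The probabilities over the whole support sum to 1 (sum truncated here).
total = sum(neg_binom_pmf(x, 3, 0.5) for x in range(3, 200))
print(total)  # very close to 1.0
```

Summing the mass function over the support is a useful sanity check that the formula really defines a probability distribution.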

### The Name of the Distribution

We are now in a position to understand why this random variable has a negative binomial distribution. The number of combinations that we encountered above can be written differently by setting *x* - *r* = *k*:

(*x* - 1)!/[(*r* - 1)!(*x* - *r*)!] = (*r* + *k* - 1)!/[(*r* - 1)!*k*!] = (*r* + *k* - 1)(*r* + *k* - 2) . . . (*r* + 1)(*r*)/*k*! = (-1)^{k}(-*r*)(-*r* - 1) . . . (-*r* - (*k* - 1))/*k*!.

Here we see the appearance of a negative binomial coefficient, which is used when we raise a binomial expression (a + b) to a negative power.
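The rewritten coefficient can be checked numerically. The sketch below compares the ordinary combination C(*x* - 1, *r* - 1) with the signed product (-1)^{k}(-*r*)(-*r* - 1) . . . (-*r* - (*k* - 1))/*k*! for several values (`signed_form` is a name chosen here for illustration):

```python
from math import comb, factorial

def signed_form(r, k):
    """(-1)^k * (-r)(-r - 1)...(-r - (k - 1)) / k!"""
    prod = 1
    for i in range(k):
        prod *= -r - i
    return (-1) ** k * prod // factorial(k)

# For x = r + k the two forms of the coefficient agree.
for r in range(1, 6):
    for k in range(8):
        assert comb(r + k - 1, r - 1) == signed_form(r, k)
print("coefficient identity checked")
```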

### Mean

The mean of a distribution is important to know because it is one way to denote the center of the distribution. The mean of this type of random variable is given by its expected value and is equal to *r* / *p*. We can prove this carefully by using the moment generating function for this distribution.

Intuition guides us to this expression as well. Suppose that we perform a series of *n*_{1} trials until we obtain *r* successes. Then we do this again, only this time it takes *n*_{2} trials. We continue this over and over until we have a large number *k* of groups of trials, for a total of *N* = *n*_{1} + *n*_{2} + . . . + *n*_{k} trials.

Each of these *k* groups of trials contains *r* successes, and so we have a total of *kr* successes. If *N* is large, then we would expect to see about *Np* successes. Thus we equate these and have *kr* = *Np*.

We do some algebra and find that *N* / *k* = *r* / *p*. The fraction on the left-hand side of this equation is the average number of trials required for each of our *k* groups of trials. In other words, this is the expected number of times to perform the experiment so that we have a total of *r* successes. This expectation is exactly what we wished to find, and it equals *r* / *p*.

### Variance

The variance of the negative binomial distribution can also be calculated by using the moment generating function. When we do this we see the variance of this distribution is given by the following formula:

*r*(1 - *p*)/*p*^{2}
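Both the mean *r* / *p* and the variance *r*(1 - *p*)/*p*^{2} can be verified directly by summing against the mass function. A minimal sketch, with the sum truncated where the tail is negligible:

```python
from math import comb

def pmf(x, r, p):
    # C(x - 1, r - 1) p^r (1 - p)^(x - r), the mass function above
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

r, p = 5, 0.25
support = range(r, 2000)  # truncated support; the tail is negligible here
mean = sum(x * pmf(x, r, p) for x in support)
var = sum((x - mean) ** 2 * pmf(x, r, p) for x in support)
print(mean)  # r / p = 20
print(var)   # r(1 - p) / p^2 = 60
```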

### Moment Generating Function

The moment generating function for this type of random variable is quite complicated.

Recall that the moment generating function is defined to be the expected value E[e^{tX}]. By using this definition with our probability mass function, we have:

M(t) = E[e^{tX}] = Σ_{x = r}^{∞} (*x* - 1)!/[(*r* - 1)!(*x* - *r*)!] e^{tx} *p*^{r}(1 - *p*)^{x - r}

After some algebra this becomes M(t) = (*p*e^{t})^{r}[1 - (1 - *p*)e^{t}]^{-r}, valid whenever (1 - *p*)e^{t} < 1, that is, for *t* < -ln(1 - *p*).
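The closed form can be sanity-checked against a truncated version of the defining series. A sketch, assuming *t* is small enough that (1 - *p*)e^{t} < 1 so the series converges:

```python
from math import comb, exp

def mgf_series(t, r, p, terms=500):
    """Truncated E[e^{tX}] computed straight from the mass function."""
    return sum(comb(x - 1, r - 1) * exp(t * x) * p**r * (1 - p)**(x - r)
               for x in range(r, r + terms))

def mgf_closed(t, r, p):
    return (p * exp(t)) ** r * (1 - (1 - p) * exp(t)) ** (-r)

r, p, t = 3, 0.5, 0.1  # (1 - p)e^t ≈ 0.55 < 1, so the series converges
print(mgf_series(t, r, p))
print(mgf_closed(t, r, p))  # the two values agree
```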

### Relationship to Other Distributions

We have seen above how the negative binomial distribution is similar in many ways to the binomial distribution. In addition to this connection, the negative binomial distribution is a more general version of a geometric distribution.

A geometric random variable *X* counts the number of trials necessary until the first success occurs. This is exactly the negative binomial distribution with *r* equal to one.
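Setting *r* = 1 in the mass function gives *f*(*x*) = *p*(1 - *p*)^{x - 1}, the familiar geometric mass function. A quick check of this reduction:

```python
from math import comb

def neg_binom_pmf(x, r, p):
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

def geometric_pmf(x, p):
    """P(first success on trial x) = p(1 - p)^(x - 1)."""
    return p * (1 - p) ** (x - 1)

# With r = 1 the two mass functions coincide on the support 1, 2, 3, ...
p = 0.3
for x in range(1, 20):
    assert abs(neg_binom_pmf(x, 1, p) - geometric_pmf(x, p)) < 1e-15
print("geometric = negative binomial with r = 1")
```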

Other formulations of the negative binomial distribution exist. Some textbooks define *X *to be the number of trials until *r *failures occur.

### Example Problem

We will look at an example problem to see how to work with the negative binomial distribution. Suppose that a basketball player is an 80% free throw shooter. Further, assume that making one free throw is independent of making the next. What is the probability that for this player the eighth basket is made on the tenth free throw?

We see that we have a setting for a negative binomial distribution. The constant probability of success is 0.8, and so the probability of failure is 0.2. We want to determine the probability of *X* = 10 when *r* = 8.

We plug these values into our probability mass function:

*f*(10) = C(10 - 1, 8 - 1)(0.8)^{8}(0.2)^{2} = 36(0.8)^{8}(0.2)^{2}, which is approximately 24%.
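The same computation in code, plugging the values straight into the mass function:

```python
from math import comb

p, r, x = 0.8, 8, 10
# C(x - 1, r - 1) p^r (1 - p)^(x - r)
prob = comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)
print(round(prob, 4))  # 0.2416
```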

We could then ask for the average number of free throws this player must shoot before making eight of them. Since the expected value is 8/0.8 = 10, this player needs ten free throws on average.