Sampling is central to the practice of statistics. The goal is to determine something about a population. Since populations are typically large, we form a statistical sample by selecting a subset of the population that is of a predetermined size. By studying the sample we can use inferential statistics to draw conclusions about the population.

A statistical sample of size *n* involves a single group of *n* individuals or subjects that have been randomly chosen from the population. Closely related to the concept of a statistical sample is a sampling distribution.

### Origin of Sampling Distributions

A sampling distribution occurs when we form more than one simple random sample of the same size from a given population. These samples are considered to be independent of one another. So if an individual is in one sample, then it has the same likelihood of being in the next sample that is taken.

We calculate a particular statistic for each sample. This could be a sample mean, a sample variance or a sample proportion. Since a statistic depends upon the sample that we have, each sample will typically produce a different value for the statistic of interest. The distribution of the values that have been produced, meaning which values occur and how often, is what gives us our sampling distribution.
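The process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is a made-up list of 10,000 numbers, and the helper name `sampling_distribution`, the sample size of 25 and the count of 1,000 samples are all hypothetical choices, not values from the text.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 values centered near 50.
population = [rng.gauss(50, 10) for _ in range(10_000)]

def sampling_distribution(population, n, num_samples, statistic, rng):
    """Draw num_samples simple random samples of size n and
    return the statistic computed on each sample."""
    # rng.sample draws without replacement within one sample;
    # separate calls do not depend on one another.
    return [statistic(rng.sample(population, n)) for _ in range(num_samples)]

# The same procedure works for any statistic of interest.
means = sampling_distribution(population, 25, 1000, statistics.mean, rng)
variances = sampling_distribution(population, 25, 1000, statistics.variance, rng)

print(means[:4])       # four different values of the sample mean
print(variances[:4])   # four different values of the sample variance
```

Each list holds one value of the statistic per sample, and the way those values are spread out is an approximation of the sampling distribution.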

### Sampling Distribution for Means

For an example we will consider the sampling distribution for the mean. The mean of a population is a parameter that is typically unknown. If we select a sample of size 100, then the mean of this sample is easily computed by adding all values together and then dividing by the total number of data points, in this case 100. One sample of size 100 may give us a mean of 50. Another such sample may have a mean of 49, a third a mean of 51, and a fourth a mean of 50.5.

The distribution of these sample means gives us a sampling distribution. We would want to consider more than just four sample means as we have done above. With several more sample means we would have a good idea of the shape of the sampling distribution.
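To get a feel for that shape, one option is to simulate many samples of size 100 and tally the resulting means. The sketch below assumes a hypothetical population of 50,000 values with a mean near 50 and a standard deviation near 10; those numbers, and the choice of 2,000 samples, are illustrative only.

```python
import random
import statistics
from collections import Counter

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population with mean near 50 and standard deviation near 10.
population = [rng.gauss(50, 10) for _ in range(50_000)]

# Draw 2,000 samples of size 100 and record the mean of each.
sample_means = [
    statistics.fmean(rng.sample(population, 100)) for _ in range(2000)
]

# Round each mean to the nearest 0.5 and count how often each value occurs.
buckets = Counter(round(m * 2) / 2 for m in sample_means)

# Crude text histogram: one '#' per 10 sample means in a bucket.
for value in sorted(buckets):
    print(f"{value:5.1f} {'#' * (buckets[value] // 10)}")
```

The printed bars pile up around the population mean and thin out on either side, which is the shape that a handful of sample means cannot reveal.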

### Why Do We Care?

Sampling distributions may seem fairly abstract and theoretical. However, there are some very important consequences from using these. One of the main advantages is that they let us quantify the variability that is present in a statistic, and show how a larger sample size reduces it.

For instance, suppose we start with a population with mean of μ and standard deviation of σ. The standard deviation gives us a measurement of how spread out the distribution is. We will compare this to a sampling distribution obtained by forming simple random samples of size *n*. The sampling distribution of the mean will still have mean of μ, but the standard deviation is different. The standard deviation of the sampling distribution of the mean becomes σ/√*n*.

Thus we have the following:

- A sample size of 4 allows us to have a sampling distribution with standard deviation of σ/2.
- A sample size of 9 allows us to have a sampling distribution with standard deviation of σ/3.
- A sample size of 25 allows us to have a sampling distribution with standard deviation of σ/5.
- A sample size of 100 allows us to have a sampling distribution with standard deviation of σ/10.
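The values in this list can be checked empirically. The following sketch compares the observed standard deviation of simulated sample means with σ/√*n* for the same four sample sizes. The population (50,000 values with a standard deviation near 12) and the 4,000 samples per size are assumptions chosen for illustration.

```python
import math
import random
import statistics

rng = random.Random(2)  # fixed seed so the sketch is repeatable

# Hypothetical population; sigma is its actual standard deviation.
population = [rng.gauss(0, 12) for _ in range(50_000)]
sigma = statistics.pstdev(population)

results = {}
for n in (4, 9, 25, 100):
    # The population is large relative to n, so drawing each sample
    # without replacement makes little difference here.
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(4000)]
    observed = statistics.stdev(means)   # spread of the simulated sample means
    predicted = sigma / math.sqrt(n)     # value given by the formula
    results[n] = (observed, predicted)
    print(f"n={n:3d}  observed={observed:.3f}  predicted={predicted:.3f}")
```

The observed and predicted columns should agree closely, and both shrink as the sample size grows. Note that quadrupling the sample size only halves the spread.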

### In Practice

In the practice of statistics we rarely form sampling distributions. Instead we treat a statistic derived from a simple random sample of size *n* as one point along the corresponding sampling distribution. This emphasizes again why we desire to have relatively large sample sizes. The larger the sample size, the less variation we will see in our statistic.

Note that, other than the center and spread, we are unable to say anything about the shape of our sampling distribution. It turns out that under some fairly broad conditions, the Central Limit Theorem can be applied to tell us something quite amazing about the shape of a sampling distribution.
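The Central Limit Theorem says that, under its conditions, the sampling distribution of the mean becomes approximately normal as the sample size grows, even when the population itself is not normal. A rough way to see this is to start from a strongly skewed population and watch the skewness of the sample means shrink toward zero, the value for a symmetric distribution. In the sketch below the exponential population, the helper names `skewness` and `sample_means`, and the sample sizes are all hypothetical choices, and skewness is only one rough indicator of shape.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the sketch is repeatable

def skewness(values):
    """Average cubed z-score: near 0 for a symmetric distribution,
    positive when there is a long right tail."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

# Hypothetical, strongly right-skewed population.
population = [rng.expovariate(1.0) for _ in range(50_000)]

def sample_means(n, num_samples=3000):
    """Means of num_samples simple random samples of size n."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(num_samples)]

skew_population = skewness(population)
skew_by_n = {n: skewness(sample_means(n)) for n in (2, 10, 50)}

print(f"population skewness: {skew_population:.2f}")
for n, skew in skew_by_n.items():
    print(f"n={n:2d}  skewness of sample means: {skew:.2f}")
```

The skewness of the sample means falls as *n* increases, so the sampling distribution looks more symmetric and bell-shaped than the population it came from.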