What Is a Population in Statistics?

In statistics, the term population is used to describe the subjects of a particular study—everything or everyone who is the subject of a statistical observation. Populations can be large or small in size and defined by any number of characteristics, though these groups are typically defined specifically rather than vaguely—for instance, a population of women over 18 who buy coffee at Starbucks rather than a population of women over 18.

Statistical populations are used to observe behaviors, trends, and patterns in the way individuals in a defined group interact with the world around them, allowing statisticians to draw conclusions about the characteristics of the subjects of study. These subjects are most often humans, animals, and plants, but they can also be objects such as stars.

Importance of Populations

The Australian Bureau of Statistics notes:

It is important to understand the target population being studied, so you can understand who or what the data are referring to. If you have not clearly defined who or what you want in your population, you may end up with data that are not useful to you.  

There are, of course, certain limitations to studying populations, chiefly that it is rare to be able to observe every individual in a given group. For this reason, scientists who use statistics also study subpopulations and take statistical samples of small portions of larger populations, then use those samples to estimate the behaviors and characteristics of the population at large.

What Constitutes a Population?

A statistical population is any group of individuals who are the subject of a study, meaning that almost anything can make up a population so long as the individuals can be grouped together by one or more common features. For example, in a study that is trying to determine the mean weight of all 20-year-old males in the United States, the population would be all 20-year-old males in the United States.

Another example would be a study that investigates how many people live in Argentina wherein the population would be every person living in Argentina, regardless of citizenship, age, or gender. By contrast, the population in a separate study that asked how many men under 25 lived in Argentina might be all men who are 24 and under who live in Argentina regardless of citizenship.

Statistical populations can be as vague or specific as the statistician desires; it ultimately depends on the goal of the research being conducted. A cow farmer wouldn't want to know the statistics on how many red female cows he owns; instead, he would want to know how many female cows he has that are still able to produce calves. That farmer would want to select the latter as his population of study.

Population Data in Action

There are many ways that you can use population data in statistics. StatisticsShowHowto.com explains a fun scenario where you can't resist temptation and walk into a candy store, where the owner might be offering a few samples of her products. You would eat one candy from each sample; you wouldn't want to eat a sample of every candy in the store. That would require sampling from hundreds of jars, and likely would make you quite sick.

Instead, the statistical website explains:

"You might base your opinion about the entire store’s candy line on (just) the samples they have to offer. The same logic holds true for most surveys in stats. You’re only going to want to take a sample of the whole population (“population” in this example would be the entire candy line). The result is a statistic about that population."

The Australian government's statistics bureau gives a couple of other examples, which have been slightly modified here. Imagine you want to study only people who live in the United States who were born overseas, a hot political topic today in light of the heated national debate on immigration. Instead, however, you accidentally looked at all people living in this country. The data would then include many people you do not want to study.

"You could end up with data that you do not need because your target population was not clearly defined," notes the statistics bureau.

Another relevant study might be a look at all primary school children who drink soda pop. You would need to clearly define the target population as both "primary school children" and "those who drink soda pop." Otherwise, you could end up with data that included all school children (not just pupils in primary grades) and/or everyone who drinks soda pop. The inclusion of older children and/or those who don't drink soda pop would skew your results and likely make the study unusable.
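To make the point concrete, here is a minimal Python sketch of how a loosely defined target population lets unwanted records slip in. The records, field names, and values are all invented for illustration; they do not come from the statistics bureau or any real survey.

```python
# Hypothetical survey records; every field name and value here is made up.
records = [
    {"name": "A", "school_level": "primary", "drinks_soda": True},
    {"name": "B", "school_level": "primary", "drinks_soda": False},
    {"name": "C", "school_level": "secondary", "drinks_soda": True},
    {"name": "D", "school_level": "primary", "drinks_soda": True},
    {"name": "E", "school_level": "secondary", "drinks_soda": False},
]

def in_target_population(record):
    """Both defining features must hold: primary school AND drinks soda."""
    return record["school_level"] == "primary" and record["drinks_soda"]

# Clearly defined target population: both conditions applied.
target_population = [r for r in records if in_target_population(r)]

# Loosely defined population: the school-level condition was forgotten.
too_broad = [r for r in records if r["drinks_soda"]]

print([r["name"] for r in target_population])  # ['A', 'D']
print([r["name"] for r in too_broad])          # ['A', 'C', 'D']
```

The loosely defined group picks up student C, a secondary school pupil, which is exactly the kind of record that would skew the results.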

Limited Resources

Although the total population is what scientists wish to study, it is very rare to be able to perform a census of every individual member of the population. Due to constraints of resources, time, and accessibility, it is nearly impossible to perform a measurement on every subject. As a result, many statisticians, social scientists, and others use inferential statistics, which allows them to study only a small portion of the population and still draw conclusions about the whole.

Rather than performing measurements on every member of the population, scientists consider a subset of this population called a statistical sample. These samples provide measurements of the individuals that tell scientists about corresponding measurements in the population, which can then be repeated and compared with different statistical samples to more accurately describe the whole population.
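The idea of repeating and comparing samples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population below is synthetic (10,000 made-up weights drawn from a normal distribution), and the sample size of 200 and the five repetitions are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic population: 10,000 made-up weights in kg (not real data).
population = [rng.gauss(75, 12) for _ in range(10_000)]
population_mean = statistics.mean(population)

def sample_mean(pop, n, rng):
    """Mean of a simple random sample of size n drawn without replacement."""
    return statistics.mean(rng.sample(pop, n))

# Draw several independent samples and compare their estimates.
estimates = [sample_mean(population, 200, rng) for _ in range(5)]

print(f"population mean: {population_mean:.2f}")
for i, est in enumerate(estimates, start=1):
    print(f"sample {i} mean: {est:.2f}")
```

Each sample mean should land near the population mean without matching it exactly, and the spread among the five estimates gives a rough sense of how much any single sample can be trusted.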

Population Subsets

The question of which population subsets should be selected, then, is highly important in the study of statistics, and there are a variety of different ways to select a sample, many of which will not produce any meaningful results. For this reason, scientists are constantly on the lookout for potential subpopulations because they typically obtain better results when recognizing the mixture of types of individuals in the populations being studied.

Different sampling techniques, such as forming stratified samples, can help in dealing with subpopulations, and many of these techniques assume that a specific type of sample, called a simple random sample, has been selected from the population.
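The difference between a simple random sample and a stratified sample can be sketched as follows. This is a minimal illustration, not a reference implementation: the population, the "urban" and "rural" subpopulations, their 80/20 split, and the helper names are all assumptions made for the example, and proportional allocation is only one of several ways to size the strata.

```python
import random
from collections import Counter, defaultdict

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Synthetic population of (id, region) pairs: 800 "urban" and 200 "rural".
# The regions and proportions are invented for illustration.
population = [(i, "urban" if i < 800 else "rural") for i in range(1000)]

def simple_random_sample(pop, n, rng):
    """Every subset of size n is equally likely to be chosen."""
    return rng.sample(pop, n)

def stratified_sample(pop, n, key, rng):
    """Proportional allocation: split pop into strata by key, then draw a
    simple random sample inside each stratum, sized by the stratum's share.
    Rounding can make the total differ slightly from n in general."""
    strata = defaultdict(list)
    for unit in pop:
        strata[key(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(pop))
        sample.extend(rng.sample(members, k))
    return sample

def region(unit):
    """Return the subpopulation label of a (id, region) pair."""
    return unit[1]

srs = simple_random_sample(population, 50, rng)
strat = stratified_sample(population, 50, region, rng)

print("simple random:", Counter(region(u) for u in srs))
print("stratified:   ", Counter(region(u) for u in strat))
```

The stratified sample always contains 40 urban and 10 rural units, mirroring the population, while the simple random sample's mix varies from draw to draw. Note that the stratified version still relies on a simple random sample inside each stratum.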