Understanding Secondary Data and How to Use It in Research

How Previously Collected Data Can Inform Sociology

Within sociology, many researchers collect new data for analytic purposes, but many others rely on secondary data (data collected by somebody else) to conduct a new study. When a researcher uses secondary data, the kind of research they perform on it is called secondary analysis.

Many secondary data resources and data sets are available for sociological research, and many of them are public and easily accessible.


There are both pros and cons to using secondary data and conducting secondary data analysis, but the cons can, for the most part, be mitigated by learning how the data was collected and cleaned in the first place, by using it carefully, and by reporting on it honestly.

What Is Secondary Data?

Unlike primary data, which is collected by a researcher herself in order to fulfill a particular research objective, secondary data is data that was collected by other researchers who likely had different research objectives. Sometimes researchers or research organizations share their data with other researchers in order to ensure that its usefulness is maximized. In addition, many government bodies within the U.S. and around the world collect data that they make available for secondary analysis. In many cases, this data is available to the general public, but in some cases, it is only available to approved users.

Secondary data can be both quantitative and qualitative in form. Secondary quantitative data is often available from official government sources and trusted research organizations. In the U.S., the U.S. Census, the General Social Survey, and the American Community Survey are some of the most commonly used secondary data sets within the social sciences.

In addition, many researchers make use of data collected and distributed by agencies including the Bureau of Justice Statistics, the Environmental Protection Agency, the Department of Education, and the U.S. Bureau of Labor Statistics, among many others at federal, state, and local levels.

While this information was collected for a wide range of purposes including budget development, policy planning, and city planning, among others, it can also be used as a tool for sociological research. By reviewing and analyzing numerical data, sociologists can often uncover unnoticed patterns of human behavior and large-scale trends within society.

Secondary qualitative data is usually found in the form of social artifacts, like newspapers, blogs, diaries, letters, and emails, among other things. Such data is a rich source of information about individuals in society and can provide a great deal of context and detail to sociological analysis.

What Is Secondary Analysis?

Secondary analysis is the practice of using secondary data in research. As a research method, it saves both time and money and avoids unnecessary duplication of research effort. Secondary analysis is usually contrasted with primary analysis, which is the analysis of primary data independently collected by a researcher.

Why Conduct Secondary Analysis?

Secondary data represents a vast resource to sociologists. It is easy to come by and often free to use. It can include information about very large populations that would be expensive and difficult to obtain otherwise. Secondary data is also available from time periods other than the present day, which matters because it is impossible to conduct primary research about events, attitudes, styles, or norms that are no longer present in today's world.

There are certain disadvantages to secondary data. In some cases, it may be outdated, biased, or improperly obtained. But a trained sociologist should be able to identify and work around or correct for such issues.

Validating Secondary Data Before Using It

To conduct meaningful secondary analysis, researchers must spend significant time reading and learning about the origins of the data sets.

Through careful reading and vetting, researchers can determine:

  • The purpose for which the material was collected or created
  • The specific methods used to collect it
  • The population studied and the validity of the sample captured
  • The credentials and credibility of the collector or creator
  • The limits of the data set (what information was not requested, collected, or presented)
  • The historic and/or political circumstances surrounding the creation or collection of the material
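The vetting questions above can be kept as a simple checklist while reading a data set's documentation. The following is only a minimal sketch in Python: the class name, field names, and example notes are hypothetical illustrations, not part of any standard tool or of a particular data set.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ProvenanceChecklist:
    """One note per vetting question; None means 'not yet answered'.

    The field names are hypothetical labels for the six questions above.
    """
    purpose: Optional[str] = None
    collection_methods: Optional[str] = None
    population_and_sample: Optional[str] = None
    collector_credibility: Optional[str] = None
    known_limits: Optional[str] = None
    historical_context: Optional[str] = None


def unanswered(checklist: ProvenanceChecklist) -> List[str]:
    """Return the names of questions that still have no non-blank note."""
    return [
        f.name
        for f in fields(checklist)
        if not (getattr(checklist, f.name) or "").strip()
    ]


# Hypothetical notes for an imaginary survey data set.
notes = ProvenanceChecklist(
    purpose="Annual household survey run for budget planning",
    collection_methods="Mailed questionnaire with phone follow-up",
)
print(unanswered(notes))
```

A researcher might treat an empty list from `unanswered` as a sign that the documentation has been read closely enough to begin analysis, though that judgment remains a substantive one rather than a mechanical one.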

In addition, before using secondary data, a researcher must consider how the data is coded or categorized and how this might influence the outcomes of a secondary data analysis. She should also consider whether the data must be adapted or adjusted in some way before she conducts her own analysis.
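To illustrate why coding matters, the sketch below recodes a variable from an imaginary data set into the categories a new study might need. The numeric codes, category labels, and missing-value codes are all assumptions made up for this example; a real data set's codebook would define its own. The point it shows is that unknown codes should raise an error instead of being silently dropped or merged.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional

# Hypothetical codebook: original survey codes mapped to analysis categories.
RECODE: Dict[int, str] = {
    1: "employed",            # e.g., full-time
    2: "employed",            # e.g., part-time
    3: "unemployed",
    4: "not in labor force",
}

# Hypothetical codes the original collectors used for "don't know" / "refused".
MISSING_CODES = {8, 9}


def recode(values: Iterable[int]) -> List[Optional[str]]:
    """Map raw codes to categories; missing codes become None.

    Raises ValueError for any code absent from the codebook, so that an
    undocumented category cannot slip into the analysis unnoticed.
    """
    out: List[Optional[str]] = []
    for v in values:
        if v in MISSING_CODES:
            out.append(None)
        elif v in RECODE:
            out.append(RECODE[v])
        else:
            raise ValueError(f"Code {v!r} is not in the codebook")
    return out


raw = [1, 2, 3, 4, 9, 1]
recoded = recode(raw)
print(Counter(recoded))
```

Note that collapsing codes 1 and 2 into a single "employed" category is itself an analytic decision: it discards a distinction the original collectors recorded, which is the kind of adaptation that should be reported alongside the results.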

Qualitative data is usually created under known circumstances by named individuals for a particular purpose. This makes it relatively easy to analyze the data with an understanding of biases, gaps, social context, and other issues.

Quantitative data, however, may require more critical analysis. It is not always clear how data was collected, why certain types of data were collected while others were not, or whether any bias was involved in the creation of tools used to collect the data. Polls, questionnaires, and interviews can all be designed to produce predetermined outcomes.

While biased data can be extremely useful, it is absolutely critical that the researcher is aware of the bias, its purpose, and its extent.

Updated by Nicki Lisa Cole, Ph.D.