Introduction to the Human Genome Project

The Human Genome Project made a map of the genes of a human being. PASIEKA/SCIENCE PHOTO LIBRARY / Getty Images

The set of nucleic acid sequences or genes that form the DNA of an organism is its genome. Essentially, a genome is a molecular blueprint for constructing an organism. The human genome is the genetic code in the DNA of the 23 chromosome pairs of Homo sapiens, plus the DNA found within human mitochondria. Egg and sperm cells contain 23 chromosomes (haploid genome) consisting of around three billion DNA base pairs.

Somatic cells (e.g., brain, liver, heart) have 23 chromosome pairs (diploid genome) and around six billion base pairs. About 0.1 percent of the base pairs differ from one person to the next. The human genome is about 96 percent similar to that of a chimpanzee, the species that is the nearest genetic relative.

The international scientific research community sought to construct a map of the sequence of the nucleotide base pairs that make up human DNA. The United States government started planning the Human Genome Project, or HGP, in 1984 with the goal of sequencing the three billion nucleotides of the haploid genome. A small number of anonymous volunteers supplied the DNA for the project, so the completed human genome was a mosaic of human DNA and not the genetic sequence of any one person.

Human Genome Project History and Timeline

While the planning stage started in 1984, the HGP didn't officially launch until 1990.

At the time, scientists estimated it would take 15 years to complete the map, but advances in technology led to completion in April of 2003 rather than in 2005. The U.S. Department of Energy (DOE) and U.S. National Institutes of Health (NIH) provided most of the public funding. The budget was $3 billion, but because the project finished early, the total came to $2.7 billion.

Geneticists from all over the world were invited to participate in the Project. In addition to the United States, the international consortium included institutes and universities from the United Kingdom, France, Australia, China, and Germany. Scientists from many other countries also participated.

How Gene Sequencing Works

To make a map of the human genome, scientists needed to determine the order of the base pairs on the DNA of all 23 chromosomes (really, 24, if you count the sex chromosomes X and Y separately). Each chromosome contained from 50 million to 300 million base pairs, but because the base pairs on a DNA double helix are complementary (i.e., adenine pairs with thymine and guanine pairs with cytosine), knowing the composition of one strand of the DNA helix automatically provided information about the complementary strand. In other words, the nature of the molecule simplified the task.
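The complementary pairing described above can be shown in a few lines of Python. This is only an illustrative sketch: the function names and the example sequence are made up for this article and are not taken from any software used by the Project.

```python
# Base-pairing rules from the text: adenine <-> thymine, guanine <-> cytosine.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(strand: str) -> str:
    """Return the base-by-base partner of each base on the given strand."""
    return strand.upper().translate(COMPLEMENT)

def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own reading direction.

    The two strands of the helix run in opposite directions, so the
    complement is reversed before it is read.
    """
    return complement(strand)[::-1]

example = "ATGCGTTA"  # hypothetical sequence, for illustration only
print(complement(example))          # TACGCAAT
print(reverse_complement(example))  # TAACGCAT
```

Because one strand fully determines the other, only one strand has to be read to know both.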

While multiple methods were used to determine the code, the main technique employed BAC. BAC stands for "bacterial artificial chromosome." To use BAC, human DNA was broken into fragments between 150,000 and 200,000 base pairs in length. The fragments were inserted into bacterial DNA so that when the bacteria reproduced, the human DNA also replicated.

This cloning process provided enough DNA to make samples for sequencing. To cover the 3 billion base pairs of the human genome, about 20,000 different BAC clones were made.
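The figure of about 20,000 clones follows from simple division of the numbers quoted above. The short sketch below works through that arithmetic. It assumes the fragments tile the genome end to end with no overlap, which is a simplification, so the results are lower bounds rather than actual counts from the Project.

```python
GENOME_BP = 3_000_000_000               # haploid genome size given in the text
FRAGMENT_BP_RANGE = (150_000, 200_000)  # BAC fragment sizes given in the text

def min_clones(genome_bp: int, fragment_bp: int) -> int:
    """Fewest non-overlapping fragments needed to span the genome once."""
    # Integer ceiling division, so a partial final fragment still counts.
    return (genome_bp + fragment_bp - 1) // fragment_bp

for size in FRAGMENT_BP_RANGE:
    print(f"{size:,} bp fragments -> at least {min_clones(GENOME_BP, size):,} clones")
```

With 150,000-base-pair fragments the minimum is 20,000 clones, and with 200,000-base-pair fragments it is 15,000, which is consistent with the "about 20,000" quoted above.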

The BAC clones made what is called a "BAC library" that contained all the genetic information for a human, but it was like a library in chaos, with no way to tell the order of the "books." To fix this, each BAC clone was mapped back to human DNA to find its position in relation to other clones.

Next, the BAC clones were cut into smaller fragments about 2,000 base pairs in length for sequencing. These "subclones" were loaded into a machine called a sequencer. The sequencer read 500 to 800 base pairs at a time, and a computer assembled these short reads into the correct order to match the BAC clone.
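To give a feel for how a computer can put short reads back in order, here is a toy sketch of greedy overlap assembly: it repeatedly joins the two reads whose ends overlap the most. This is a simplified illustration, not the software the Project used. The function names, the minimum overlap of three bases, and the tiny error-free reads are all assumptions made for the example. Real reads were hundreds of bases long and contained errors.

```python
def overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap(left, right, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more; leave the pieces separate
        # Join the pair, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs

# Hypothetical reads that overlap by four bases each.
reads = ["ATGCGTAC", "GTACCTGA", "CTGATTGC"]
print(greedy_assemble(reads))  # ['ATGCGTACCTGATTGC']
```

A greedy approach like this can be fooled by repeated sequences, which is one reason mapping each BAC clone to a known position first made the overall job more manageable.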

As the base pairs were determined, they were made available to the public online and free to access.

Eventually, all the pieces of the puzzle were sequenced and arranged to form a complete genome.

Goals of the Human Genome Project

The primary goal of the Human Genome Project was to sequence the 3 billion base pairs that make up human DNA. From the sequence, the 20,000 to 25,000 estimated human genes could be identified. The genomes of other scientifically significant species were also sequenced as part of the Project, including the genomes of the fruit fly, mouse, yeast, and roundworm. The Project developed new tools and technology for genetic manipulation and sequencing. Public access to the genome ensured that researchers anywhere in the world could use the information to spur new discoveries.

Why the Human Genome Project Was Important

The Human Genome Project formed the first blueprint for a person and remains the largest collaborative biology project that humanity has ever completed. Because the Project sequenced the genomes of multiple organisms, scientists could compare them to uncover the functions of genes and to identify which genes are necessary for life.

Scientists took the information and techniques from the Project and used them to identify disease genes, devise tests for genetic diseases, and repair damaged genes to prevent problems before they occur. The information is used to predict how a patient will respond to a treatment based on a genetic profile. While the first map took years to complete, advances have led to faster sequencing, allowing scientists to study genetic variation in populations and more quickly determine what specific genes do.

The Project also included the development of an Ethical, Legal, and Social Implications (ELSI) program. ELSI became the largest bioethics program in the world and serves as a model for programs that deal with new technologies.