Written by: Harshitha A
Last modified: 25-01-2023
Human Genome Project: Goals, Process and Applications
The genetic make-up of an organism or an individual lies in its DNA sequence. If two individuals differ, their DNA sequences must differ as well. These assumptions prompted the quest to determine the complete DNA sequence of the human genome, which led to the megaproject called the Human Genome Project (HGP), launched in \(1990.\) This article covers the goals, process, salient features, applications and limitations of the Human Genome Project.
What is the Human Genome Project?
The Human Genome Project was an international collaborative research programme whose goal was to map and understand all the genes present in human beings, i.e., the human genome, and to determine the sequence of nucleotides in it. Its initial results suggested that the genome contains about \(30,000\) genes, a figure later refined to roughly \(20,000\) to \(25,000.\) The ultimate result of the HGP is ‘detailed information about the structure, organisation and function of the complete set of human genes.’
Fig: Human Genome Project
Pioneers in Human Genome Project
- Robert Sinsheimer proposed the idea of sequencing the human genome in \(1985.\)
- Charles DeLisi and David Smith proposed the budget for the Human Genome Project.
- The Human Genome Project Act was passed by the U.S. Congress under President Reagan in \(1988.\)
- James Watson headed the NIH genome programme (NIH: National Institutes of Health).
- Francis Collins succeeded James Watson in \(1993\) as the overall head of the project and as director of the NIH's National Center for Human Genome Research (which later became the National Human Genome Research Institute, NHGRI), and held that position until the completion of the HGP in \(2003.\)
- Jim Kent, a graduate student at the University of California, Santa Cruz (UCSC), developed a program called GigAssembler in May \(2000\) that allowed the publicly funded Human Genome Project to assemble and publish the human genome sequence.
Why is Human Genome Project Called a Megaproject?
The Human Genome Project was a \(13\)-year project coordinated by the U.S. Department of Energy and the National Institutes of Health. The Wellcome Trust (U.K.) soon joined the project as a major partner, and additional contributions came from Japan, France, Germany, China and others. The project was completed in \(2003.\) The HGP has been called a megaproject because of the following:
- Its huge cost, estimated at \(9\) billion U.S. dollars on the assumption that sequencing \({\rm{1\,bp}}\) costs US\(\$ 3.\)
- A very large number of base pairs \(\left( {3 \times {{10}^9}\,{\rm{bp}}} \right)\) to be identified and sequenced.
- It required a large number of scientists, technicians and supporting staff.
- The storage of the data generated: in typed form, the sequence would fill some \(3300\) books, each with \(1000\) pages and each page carrying \(1000\) letters. High-speed computational devices made the storage, retrieval and analysis of this data practical.
- The science of bioinformatics also developed during this period and helped the HGP.
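The cost and storage figures in the list above follow from simple arithmetic. The short sketch below reproduces them, using only the numbers quoted in this article; the function and variable names are illustrative and not part of any HGP tooling.

```python
# Back-of-the-envelope check of the megaproject figures quoted above.
GENOME_BP = 3 * 10**9      # approximate number of base pairs to be sequenced
COST_PER_BP_USD = 3        # per-base cost assumed in the original estimate


def estimated_cost_usd(base_pairs: int, cost_per_bp: int) -> int:
    """Total sequencing cost if every base pair costs the same amount."""
    return base_pairs * cost_per_bp


def letters_in_books(books: int, pages_per_book: int, letters_per_page: int) -> int:
    """Number of typed letters that fit in the given set of books."""
    return books * pages_per_book * letters_per_page


total_cost = estimated_cost_usd(GENOME_BP, COST_PER_BP_USD)
book_capacity = letters_in_books(3300, 1000, 1000)

print(f"Estimated cost: US${total_cost:,}")
print(f"Letters that fit in 3300 books: {book_capacity:,}")
```

At one letter per base, the \(3300\) books hold \(3.3 \times {10^9}\) letters, which is enough for the roughly \(3 \times {10^9}\) base pairs of the genome.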
Goals of Human Genome Project
Some of the important goals of HGP are as follows:
- Determine the sequences of the three billion chemical base pairs that make up human DNA.
- Identify all the approximately \(20,000\) to \(25,000\) genes in human DNA.
- Store this information in databases.
- Improve tools for data analysis.
- Transfer related technologies to other sectors, such as pharma industries.
- Address the ethical, legal and social issues (ELSI) that may arise from the project.
- Sequencing of model organisms: The DNA sequences of non-human organisms can lead to an understanding of their natural capabilities, which can be applied to solving challenges in health care, agriculture, energy production and environmental remediation. Many non-human model organisms, such as bacteria, yeast, Caenorhabditis elegans (a free-living non-pathogenic nematode), Drosophila, and plants like rice and Arabidopsis, have been sequenced.
Fig: Goals of Human Genome Project
Methods of Human Genome Project
The Human Genome Project relied on two main approaches:
- The first approach uses expressed sequence tags (ESTs). It focuses on sequencing only the DNA that is transcribed into mRNA, which is then translated into protein. ESTs are therefore concerned with the DNA segments that are expressed as genes.
- The second approach is sequence annotation. Here the whole genome, including coding and non-coding regions, is sequenced first, and the different regions are later categorised and labelled according to their function.
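The difference between the two approaches can be illustrated with a toy example. The sketch below is a simplification under stated assumptions: `locate_tags` stands in for the EST approach by looking up known expressed fragments in a sequence, and `find_orfs` stands in for annotation by scanning an already sequenced stretch for open reading frames (a start codon followed in-frame by a stop codon). Both function names and the toy sequence are hypothetical; real EST mapping and genome annotation rely on far more sophisticated alignment and gene-prediction methods.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def locate_tags(dna: str, tags: list[str]) -> dict[str, int]:
    """EST-style lookup: position of each expressed tag in the sequence (-1 if absent)."""
    return {tag: dna.find(tag) for tag in tags}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Annotation-style scan of the forward strand for open reading frames.

    Returns (start, end) index pairs; `end` is exclusive and includes the stop codon.
    `min_codons` counts the codons before the stop codon, start codon included.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


toy_sequence = "CCATGGCTTGATAGG"  # made-up sequence for illustration only
print(locate_tags(toy_sequence, ["GGCTT"]))
print(find_orfs(toy_sequence))
```

In this toy sequence the only reading frame found is `ATGGCTTGA`, which also contains the tag `GGCTT`, so both approaches point at the same region.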
Process of Human Genome Project
- The complete DNA was isolated from a cell.
- The DNA was then divided into small fragments using restriction enzymes.
- These fragments were then cloned and amplified in suitable hosts using specialised vectors. The commonly used vectors were BACs (bacterial artificial chromosomes) and YACs (yeast artificial chromosomes).
- The fragments were sequenced using automated DNA sequencers that worked on the principle of a method developed by Frederick Sanger.
- These sequences were then arranged in order based on the overlapping regions present in them.
- This required the generation of overlapping fragments for sequencing.
- The information from the genome sequence was then stored and processed using specialised computer programs.
- The sequences were subsequently annotated and assigned to each chromosome.
- In this way, the entire genome was sequenced and stored as a genome database in computers.
- Genome mapping was the next goal, achieved with the help of microsatellites, i.e., repetitive DNA sequences.
Fig: Process of the Human Genome Project
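The step of arranging sequenced fragments by their overlapping regions can be sketched in a few lines. The example below is a minimal greedy merge on made-up fragments, assuming error-free reads and unambiguous overlaps; it is not the algorithm used by the project, whose assembly software had to cope with sequencing errors, repeats and billions of bases. The function names are illustrative.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the largest overlap.

    Returns the remaining contigs; a single contig means everything was joined.
    """
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap_length(left, right, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no remaining pair overlaps enough to be merged
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


reads = ["TACCTGA", "ATGGCGT", "GCGTACC"]  # toy overlapping fragments
print(greedy_assemble(reads))
```

The three toy fragments overlap by four bases each, so they collapse into the single contig `ATGGCGTACCTGA` regardless of the order in which they are supplied.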
Salient Features of Human Genome Project
Some of the salient observations drawn from the human genome project are as follows:
- The human genome contains \(3164.7\) million or \(3.2 \times {10^9}\) nucleotide bases.
- The average gene consists of \(3000\) bases, but sizes vary greatly: the largest known human gene is dystrophin, at \(2.4\) million bases, and the TDF gene is cited as the smallest, at \(14\) bases.
- The total number of genes in the human genome is estimated to be about \(30,000,\) which is much lower than previous estimates of \(80,000\) to \(140,000\) genes.
- About \(99.9\% \) of the nucleotide sequence is exactly the same in all people.
- The functions are unknown for over \(50\% \) of discovered genes.
- Less than \(2\% \) of the genome codes for proteins.
- Repeated sequences make up a very large portion of the human genome.
- Repetitive sequences are stretches of DNA that are repeated many times, sometimes a hundred to a thousand times. They are thought to have no direct coding function, but they shed light on chromosome structure, dynamics and evolution.
- Chromosome \(1\) has the most genes \((2968),\) and the \({\rm{Y}}\) chromosome has the fewest \((231).\)
- Scientists have identified about \(1.4\) million locations where single-base DNA differences occur in humans. These are known as SNPs (single nucleotide polymorphisms). This information promises to revolutionise the process of finding chromosomal locations for disease-associated sequences and tracing human history.
- \(98\% \) of our genome is similar to chimpanzees, and \(96\% \) is similar to gorillas, showing close evolutionary linkage.
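The ideas of sequence identity and single-base differences in the list above can be made concrete with a small sketch. It assumes two sequences that are already aligned and of equal length, and simply reports the positions where they differ. The sequences and function names are invented for illustration; real SNP discovery compares many individuals and must account for sequencing errors, insertions and deletions.

```python
def snp_positions(seq_a: str, seq_b: str) -> list[int]:
    """Positions (0-based) where two aligned, equal-length sequences differ."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which the two sequences agree."""
    differences = len(snp_positions(seq_a, seq_b))
    return 100.0 * (len(seq_a) - differences) / len(seq_a)


person_1 = "ATGCCGTAAC"  # toy sequences, not real human data
person_2 = "ATGCTGTAAC"

print(snp_positions(person_1, person_2))
print(percent_identity(person_1, person_2))
```

Here the two toy sequences differ at a single position out of ten. The same calculation applied to two real human genomes would give an identity close to the \(99.9\% \) figure quoted above.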
Applications of Human Genome Project
Some of the applications of the Human Genome Project are as follows:
- The sequencing of the human genome holds benefits for many fields, from molecular medicine to human evolution.
- It helps in the identification of mutations that are linked to different forms of cancer.
- It helps in identifying the disease-causing gene in our body.
- The sequence of the DNA is stored in databases that are available to anyone on the internet.
- The U.S. National Center for Biotechnology Information (NCBI) stores the gene sequences in a database known as GenBank, along with sequences of known and hypothetical genes and proteins.
- The fields of genomics and bioinformatics developed as a result of this project.
- It has improved forensic science where genetic fingerprinting helps to match a suspect to the biological material found at a crime scene.
Limitations of Human Genome Project
- The project was not able to sequence all of the DNA found in human cells.
- It sequenced only the euchromatic regions of the genome, which make up about \(92\% \) of the human genome.
- The remaining heterochromatic regions, found mainly around centromeres and telomeres, were not sequenced under the project.
Summary
The Human Genome Project (HGP) was a megaproject devoted to sequencing the human genome as well as the genomes of various model organisms. It began in \(1990\) and was completed in \(2003.\) The HGP was carried out using ESTs and sequence annotation. The establishment of genetic engineering techniques made it possible to isolate and clone any piece of DNA, and the availability of simple and fast techniques for determining DNA sequences made it possible to obtain the data of the human genome.
The Human Genome Project gave vital information about the human genome: the size of the genome, the number of genes on each chromosome, the largest and smallest human genes, functional analysis relating genes to their functions, the identification of SNPs that help locate disease-associated sequences, and the genes responsible for various diseases. Fields like genomics and bioinformatics also developed as a result of the Human Genome Project.
FAQs
Q.1. What are the goals of the human genome project?
Ans: Some of the important goals of HGP are as follows:
1. Determine the sequences of the three billion chemical base pairs that make up human DNA.
2. Identify all the approximately \(20,000\) to \(25,000\) genes in human DNA.
3. Store this information in databases.
4. Improve tools for data analysis.
5. Transfer related technologies to other sectors, such as pharma industries.
6. Address the ethical, legal and social issues (ELSI) that may arise from the project.
7. Sequencing of model organisms
Q.2. What is the main outcome of the Human Genome Project?
Ans: Some of the salient observations drawn from the human genome project are as follows:
1. The human genome contains \(3164.7\) million or \(3.2 \times {10^9}\) nucleotide bases.
2. The average gene consists of \(3000\) bases, but sizes vary greatly: the largest known human gene is dystrophin, at \(2.4\) million bases, and the TDF gene is cited as the smallest, at \(14\) bases.
3. The total number of genes in the human genome is estimated to be about \(30,000,\) which is much lower than previous estimates of \(80,000\) to \(140,000\) genes.
4. About \(99.9\% \) of the nucleotide sequence is exactly the same in all people.
Q.3. Who proposed the idea of HGP?
Ans: Robert Sinsheimer proposed the idea of sequencing the human genome in \(1985.\) James Watson was the first head of the NIH genome programme.
Q.4. What are the benefits of the Human Genome Project?
Ans: The benefits of the Human Genome Project are:
1. Improved diagnosis of diseases such as cancers through the identification of the mutations linked to them.
2. Earlier detection of genetic predispositions to disease.
3. Development of gene therapy and better-targeted drugs.
4. It has improved forensic science.
Q.5. What is the human genome project used for?
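Ans: The Human Genome Project was an international research effort to determine the DNA sequence of the entire human genome. The resulting sequence is used to identify disease-causing genes, study human evolution and support fields such as forensic science.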