Neha Khan, Jamia Millia Islamia
The Human Genome Project (HGP) was started with the aim of curing deadly diseases and improving healthcare. Launched in 1990, it was called a megaproject: a collaborative effort by international organizations, governments, and private companies to map and sequence the entire human genome. The HGP was devised by the U.S. Department of Energy and the National Institutes of Health (NIH). Its ultimate goal was to discover and map approximately 21,000 human genes and make them available for further research. After years of multi-billion-dollar research, the NIH and Celera Genomics (a private biotechnology company) jointly announced drafts of the human genome sequence in 2001. By mid-2001, scientists associated with these projects had described the complexity of the digital code inherent in DNA. The project was declared complete in 2003. However, the detailed sequencing of individual genes was announced only later, in 2006. Even then, the decoding of human DNA was not finished: only about 92% of it had been completed by 2006. A major update to the human reference genome (GRCh38) was released by the Genome Reference Consortium (GRC) in 2013, and its most recent patch (GRCh38.p13) followed in 2019.
- What is the Human Genome Project?
The Human Genome Project (HGP) was a scientific project carried out jointly by governments, private companies, and international organizations. The U.S. Department of Energy and the National Institutes of Health were the first to initiate it. The cost of the project was estimated at approximately $2.7 billion, but it ultimately cost about $5 billion. The most recent patch of the reference sequence, GRCh38.p13, was published in 2019. The GRC reference is based on Sanger sequencing data derived from BAC (bacterial artificial chromosome) clones, which were ordered and oriented using radiation hybrid, genetic linkage, and fingerprint maps. Many gaps in the earlier assemblies were unresolved. Most have since been filled, but GRCh38.p13 still contains 151 Mbp of unknown sequence distributed throughout the genome. In the latest effort, scientists were able to resolve the entire short arms of all five acrocentric chromosomes (Chr13, Chr14, Chr15, Chr21, Chr22) and the large human satellite arrays (e.g., on Chr1, Chr9, and Chr16).
Earlier, the genome could not be read from telomere to telomere because the low-cost sequencing technologies then available could sequence only a few hundred bases per read. Recent advances have made it possible to read long stretches of the genome with few errors. The first of these is PacBio's single-molecule, polymerase-based sequencing, which could produce multi-kilobase reads. Even with an error rate of about 15%, it proved capable of resolving complex structural variation and gaps in GRCh38. The next is Oxford Nanopore's single-molecule sequencing, which can produce "ultra-long" reads of about 1 Mbp, but again with an error rate of ~15%. PacBio's more recent "HiFi" circular consensus sequencing offers a compromise: 20 kbp read lengths with an accuracy of 99.9%, requiring only minor adjustments to standard genome assembly. The two approaches are complementary: ultra-long nanopore sequencing excels at spanning long, identical repeats, whereas HiFi excels at differentiating subtly diverged repeat copies or haplotypes.
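To make the trade-off between read length and accuracy concrete, the short Python sketch below estimates how many erroneous bases one might expect in a single read from each technology, using the approximate figures quoted above. It assumes errors are independent per base, which is a simplification. The function name `expected_errors` and the platform labels are illustrative, and the 10 kbp length for the earlier PacBio reads is an assumption, since the text says only "multi-kilobase".

```python
def expected_errors(read_length_bp: int, error_rate: float) -> float:
    """Expected number of erroneous bases in one read,
    assuming each base is miscalled independently with probability error_rate."""
    return read_length_bp * error_rate


# (read length in bp, per-base error rate); values taken from the text above,
# except the 10 kbp length for early PacBio reads, which is an assumed example.
platforms = {
    "PacBio single-molecule (assumed 10 kbp)": (10_000, 0.15),
    "Oxford Nanopore ultra-long (~1 Mbp)": (1_000_000, 0.15),
    "PacBio HiFi (20 kbp)": (20_000, 0.001),
}

for name, (length_bp, rate) in platforms.items():
    errors = expected_errors(length_bp, rate)
    print(f"{name}: ~{errors:,.0f} expected errors per read")
```

Under these assumptions a HiFi read carries on the order of tens of errors, while an ultra-long nanopore read carries many thousands but spans a far longer stretch, which is one way to see why the two are used together.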
To obtain a gapless human genome assembly, scientists took advantage of the complementary strengths of PacBio HiFi and Oxford Nanopore ultra-long-read sequencing, combined with the haploid nature of the CHM13hTERT cell line (CHM13). The resulting T2T-CHM13 reference removed a 20-year-old barrier that had hidden 8% of the genome from sequence-based analysis.
For the improved assembly, a complete hydatidiform mole (CHM) was used as the sequencing source. Most CHM genomes arise from the loss of the maternal chromosome complement and duplication of the paternal one, which makes them homozygous. CHM13 was selected for its stable 46,XX karyotype compared with other CHMs, but it was later found to possess a low level of heterozygosity, including a deletion within the rDNA array on chromosome 15 that was revealed by FISH and nanopore sequencing.
Assembling human rDNA was also a great challenge, because it had never been fully decoded. Human rDNAs are 45 kbp near-identical repeats that encode the 45S rRNA and are arranged in large tandem repeat arrays embedded within the acrocentric p-arms. They have now been decoded with the help of HiFi sequencing. Reading the genome in full, from telomere to telomere, for the first time after 20 years is a great achievement for science.
- Salient features of the human genome:
A. PUBLISHED IN 2001
- It contains 3164.7 million base pairs.
- The average gene consists of 3000 bp.
- The total number of genes is estimated at 21,000.
- Chromosome 1 has the maximum number of genes (2986) and the Y chromosome has the fewest (231).
- Chromosome sizes range from about 55 Mbp to 250 Mbp.
- Repeated sequences make up a large portion of the human genome.
- Single-nucleotide differences, called SNPs (single nucleotide polymorphisms), occur at about 1.4 million locations.
- The function of over 50% of discovered genes is unknown.
- About 98.5% of the genome is non-coding.
- Approximately 1 million copies of short 5-8 bp repeated sequences are clustered around the chromosome centromeres and are regarded as junk DNA.
B. PUBLISHED IN 2021
- The source used was a complete hydatidiform mole with a 46,XX karyotype.
- 238 Mbp of new sequence was discovered.
- The remaining 8% of the human genome has been decoded.
- The total sequence is now 3.055 billion bp.
- 2226 new paralogous gene copies were discovered.
- The entire p-arms of all 5 acrocentric chromosomes (13, 14, 15, 21, 22) were resolved.
- The major types of new sequence found are:
- Centromeric satellites
- Segmental duplications
- rDNA
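
As a quick sanity check on the figures in list B, the Python sketch below computes what share of the complete assembly the newly reported sequence represents. The constant and function names are illustrative, not taken from any published tool, and the inputs are simply the 238 Mbp and 3.055 billion bp values quoted in the list.

```python
NEW_SEQUENCE_MBP = 238      # new sequence reported for the complete assembly
TOTAL_SEQUENCE_MBP = 3055   # 3.055 billion bp, expressed in Mbp


def percent_of_total(part_mbp: float, total_mbp: float) -> float:
    """Return part_mbp as a percentage of total_mbp."""
    return 100.0 * part_mbp / total_mbp


share = percent_of_total(NEW_SEQUENCE_MBP, TOTAL_SEQUENCE_MBP)
print(f"Newly resolved sequence: {share:.1f}% of the assembly")
```

The result comes out at roughly 7.8%, which is consistent with the statement that about 8% of the genome had previously been hidden from sequence-based analysis.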
References:
- Nurk, S., Koren, S., Rhie, A., Rautiainen, M., Bzikadze, A. V., Mikheenko, A., Vollger, M. R., Altemose, N., Uralsky, L., Gershman, A., Aganezov, S., Hoyt, S. J., Diekhans, M., Logsdon, G. A., Alonge, M., Antonarakis, S. E., Borchers, M., Bouffard, G. G., Brooks, S. Y., … Phillippy, A. M. (2021). The complete sequence of a human genome [Preprint]. Genomics. https://doi.org/10.1101/2021.05.26.445798
- Collins, F. S., & Fink, L. (1995). The Human Genome Project. Alcohol Health and Research World, 19(3), 190–195. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6875757/