Madhavi Bhatia, National Institute of Pharmaceutical Education and Research Guwahati
Taxonomy is the branch of science that classifies organisms into hierarchical groups based on shared characteristics. Fields such as metagenomics and clinical forensic medicine depend on fully annotated taxonomic data to identify the organisms present in a sample and group them into taxonomic ranks such as family, order, class, or phylum.
Taxonomic data can be obtained from various databases, such as the Catalogue of Life, which serves as a backbone for projects like the Encyclopedia of Life, the Tree of Life, and GBIF. However, analyses that depend on molecular sequences use the NCBI Taxonomy. The NCBI Taxonomy is a reference taxonomy database that holds the taxonomic names and lineages of organisms that have a DNA or protein sequence registered in one of the databases of the International Nucleotide Sequence Database Collaboration (INSDC).
NCBI Taxonomy database
The taxonomic classification in NCBI Taxonomy follows a phylogenetic taxonomy scheme that reflects views from the taxonomic and molecular systematics literature, and the information is organized in the form of a tree. However, some taxa in NCBI lineages carry no taxonomic rank and are labelled ‘no rank’ or ‘clade’. Ranks are useful because they add phylogenetic separations to the hierarchy.
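To make the gap concrete, here is a minimal sketch built on a hand-written toy lineage. The lineage entries, the shortened `RANKS` list, and the helper names `by_rank` and `missing_ranks` are all assumptions made for illustration; nothing here is queried from NCBI. It shows that a rank-based lookup skips unranked nodes and finds nothing at rank levels the lineage does not contain.

```python
# Toy lineage from root to leaf as (name, rank) pairs.
# Hand-written for illustration; not a query result from NCBI Taxonomy.
lineage = [
    ("Metazoa", "kingdom"),
    ("Bilateria", "clade"),  # unranked node
    ("Chordata", "phylum"),
    ("Mammalia", "class"),
    ("Primates", "order"),
    ("Hominidae", "family"),
    ("Homo", "genus"),
    ("Homo sapiens", "species"),
]

# Shortened rank list, highest to lowest (assumption: the real
# NCBI Taxonomy uses many more ranks than these nine).
RANKS = ["kingdom", "phylum", "subphylum", "class", "subclass",
         "order", "family", "genus", "species"]


def by_rank(lineage):
    """Index the lineage by rank, ignoring nodes whose rank is not in RANKS."""
    return {rank: name for name, rank in lineage if rank in RANKS}


def missing_ranks(lineage):
    """Return the rank levels in RANKS that have no taxon in this lineage."""
    present = by_rank(lineage)
    return [rank for rank in RANKS if rank not in present]


print(by_rank(lineage).get("subphylum"))  # nothing stored at this level
print(missing_ranks(lineage))
```

An analysis that groups samples at one of the missing levels would have no label for this lineage, which is the problem a hierarchically complete tree is meant to remove.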
Taxallnomy
Taxallnomy is an extension of the NCBI Taxonomy database that consists of hierarchically complete trees, in which every taxonomic lineage has a node at each rank level. It provides names for all ranks that are missing from a lineage in the NCBI database. The taxallnomy tree has 41 hierarchical levels, corresponding to the 41 taxonomic ranks present in the NCBI Taxonomy. From this database, a user can obtain the complete taxonomic lineage, with 41 nodes, for every taxon available in the NCBI Taxonomy. Taxallnomy can be used in any bioinformatics analysis that depends on information from the NCBI Taxonomy.
The taxallnomy algorithm works by assigning a rank to pre-existing unranked nodes (nodes of type 1) and by creating new nodes (nodes of types 2 and 3), while keeping the hierarchical structure. It also generates suitable names for those nodes. A missing rank in a lineage is filled by taking the name of the first taxon at a lower or a higher rank level; the preference for a higher rank level over a lower one is a conceptual choice. A node of type 3 is created when a lineage lacks the higher rank level; in that case, the algorithm uses the name of the last node of the lineage when assigning the rank. With this algorithm, ranks are assigned to more than 99% of all ‘no rank’ taxa without disturbing the rank hierarchy already established by the ranked taxa.
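The gap-filling idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the published Taxallnomy implementation: the shortened `RANKS` list, the function name `complete_lineage`, the `rank_from_Name` label format, and the rule "borrow from the nearest higher rank, otherwise from the nearest lower one" are assumptions made for this example. Ranking pre-existing unranked nodes (type 1) is not modelled.

```python
# Shortened rank list, highest to lowest (assumption: the real tree has 41 levels).
RANKS = ["kingdom", "phylum", "subphylum", "class", "subclass",
         "order", "family", "genus", "species"]


def complete_lineage(ranked):
    """Fill every level in RANKS for one lineage.

    `ranked` maps rank -> taxon name for the ranks the lineage already has.
    Missing levels get a placeholder whose name is borrowed from the nearest
    ranked taxon above, or below when nothing ranked exists above.
    """
    completed = []
    for i, rank in enumerate(RANKS):
        if rank in ranked:
            completed.append((rank, ranked[rank]))
            continue
        # Look upward first (nearest higher rank), then downward.
        donor = next((ranked[r] for r in reversed(RANKS[:i]) if r in ranked), None)
        if donor is None:
            donor = next((ranked[r] for r in RANKS[i + 1:] if r in ranked), None)
        if donor is None:
            raise ValueError("lineage has no ranked taxon to borrow a name from")
        # Hypothetical label format; the real database uses its own naming scheme.
        completed.append((rank, f"{rank}_from_{donor}"))
    return completed


human = {
    "kingdom": "Metazoa", "phylum": "Chordata", "class": "Mammalia",
    "order": "Primates", "family": "Hominidae",
    "genus": "Homo", "species": "Homo sapiens",
}
for rank, name in complete_lineage(human):
    print(f"{rank:10s} {name}")
```

Every lineage passed through `complete_lineage` comes back with the same number of levels, so lineages from different organisms can be compared rank by rank.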
Conclusion
Metagenomics and several other bioinformatics analyses and tools rely on taxonomic data provided by NCBI Taxonomy. However, the absence of some ranks in the lineages can interfere with proper analysis of a sample. To avoid this, the taxallnomy algorithm takes the taxonomic tree from NCBI Taxonomy and completes it hierarchically according to the taxonomic ranks.
Reference:
- Sakamoto, T., & Ortega, J. M. (2021). Taxallnomy: An extension of NCBI Taxonomy that produces a hierarchically complete taxonomic tree. BMC Bioinformatics, 22(1), 388. https://doi.org/10.1186/s12859-021-04304-3
About the author
Madhavi Bhatia is currently pursuing a Master of Science in Pharmaceutical Biotechnology from NIPER, Guwahati. Her area of interest lies in understanding the role of gene mutation in the development of various diseases and developing a treatment for such diseases.
Publications:
1. https://bioxone.in/news/worldnews/association-of-kidney-and-liver-biomarkers-with-mets/
2. https://bioxone.in/news/worldnews/mitotox-a-database-about-mitochondrial-toxicity/
3. https://bioxone.in/news/worldnews/si-rna-nanoparticles-used-to-treat-neuroblastoma/