Husna, Amity University Kolkata
With advances in multi-omics and the growing use of high-throughput methods for characterizing translation, there is a need for back-end functions and streamlined tools to process and analyse the data these assays produce. Such a tool should reduce the burden on users, letting them analyse the complex process of translation with minimal time spent on coding and focus on biological questions instead.
Recently, a user-friendly Bioconductor API and toolbox named ORFik was introduced for studying translation and its regulation. ORFik takes its name from the open reading frame (ORF), a stretch of sequence in a given reading frame that has the potential to be translated. An ORF begins with a start codon and ends with a stop codon.
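To make the ORF definition concrete, here is a minimal Python sketch that scans the three forward reading frames of a DNA sequence for stretches that begin with a start codon and end with an in-frame stop codon. It is an illustration only and not ORFik's implementation. The function name `find_orfs`, the `min_codons` parameter, and the choice of ATG as the only start codon are assumptions made for this example.

```python
# Illustrative ORF scan; not ORFik's actual algorithm.
START_CODON = "ATG"                   # assumption: canonical start codon only
STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard stop codons


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end, frame) tuples for forward-strand ORFs.

    Coordinates are 0-based and `end` is exclusive. `min_codons` counts
    every codon in the ORF, including the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon within this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # open an ORF at the first start codon seen
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next one
    return sorted(orfs)


if __name__ == "__main__":
    for start, end, frame in find_orfs("CCATGAAATAGGG"):
        print(start, end, frame)
```

A real analysis would also need to handle the reverse strand, alternative start codons and transcript coordinates, which this sketch deliberately leaves out.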
Why is ORFik preferred for analysing translation?
Although various tools exist for analysing translation, none currently supports the comprehensive analysis of ribo-seq, TCP-seq and RCP-seq in combination with CAGE. Furthermore, many of them are either online tools or are limited to specific steps or aspects of translation.
Translation was previously studied on a gene-by-gene basis, but the introduction of ribosome profiling (ribo-seq), translation complex profiling (TCP-seq) and, later, ribosome complex profiling (RCP-seq) made it possible to obtain a snapshot of translating and scanning ribosomes across the whole transcriptome.
Gaining functional insight from sequencing data requires robust computational analysis. Ribo-seq, being a mature assay, is supported by a number of software packages and web services, but these are typically designed to handle only specific features. TCP-seq and RCP-seq, on the other hand, are much less supported.
A further complication is that many genes have alternative transcription start sites (TSSs). Studying translation initiation requires highly accurate 5′ UTR annotations; without them, it is hard to decide which uORFs should be included in the analysis. Moreover, in certain cases a uORF is present only in the specific tissues that express the corresponding variant of the 5′ UTR. ORFik was developed to address these challenges and to provide a comprehensive tool for studying translation in custom regions.
ORFik is also highly optimized and fast. To achieve this, its developers reimplemented several functions from core Bioconductor packages. Overall, ORFik provides a toolbox of functions that lets the user go far beyond standardized pipelines, making it a versatile API for analysing translation.
How does ORFik work?
ORFik is an open-source software package written in the R programming language, with parts implemented in C++ for efficient processing of large datasets. Some of its features include:
- Data extraction: It can extract data at scale, from the genome to the transcriptome, and implements a framework that integrates data collected from different sources. ORFik allows the different steps of translation to be visualized, with a particular focus on initiation and elongation.
- Accepting high-throughput sequencing data: It accepts high-throughput sequencing data from ribosome profiling to quantify ribosome elongation, and from TCP-seq or RCP-seq to also quantify ribosome scanning.
- Use of CAGE and RNA-seq: ORFik can use CAGE data to accurately determine 5′ UTRs, and RNA-seq data to measure translation relative to RNA abundance.
- Calculation and annotation functions: ORFik can calculate over 30 translation-related features and metrics from the literature, and can annotate translated regions such as proteins or upstream open reading frames (uORFs).
- Detection and classification: It can also help detect and classify translated uORFs and characterize their sequence features.
As a result, ORFik can be used to rapidly annotate the dynamics of 5′ UTRs across different tissues, detect their uORFs, and characterize their scanning and translation in the downstream protein-coding regions.
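The idea of measuring translation relative to RNA abundance is commonly expressed as translational efficiency: length- and depth-normalized ribosome footprint signal divided by the equivalent RNA-seq signal. The Python sketch below shows one simple way to compute it. The function names, the FPKM-style normalization and the pseudocount are assumptions chosen for illustration and are not taken from ORFik.

```python
# Illustrative translational efficiency (TE) calculation; not ORFik's code.

def fpkm(counts, lengths_bp):
    """Normalize raw read counts by gene length (kb) and library size (millions).

    Assumes the library contains at least one read in total.
    """
    total = sum(counts.values())
    return {gene: count * 1e9 / (lengths_bp[gene] * total)
            for gene, count in counts.items()}


def translational_efficiency(ribo_counts, rna_counts, lengths_bp, pseudo=1.0):
    """Return TE per gene as (ribo FPKM + pseudo) / (RNA FPKM + pseudo).

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no RNA-seq reads. Genes missing from either input are skipped.
    """
    ribo = fpkm(ribo_counts, lengths_bp)
    rna = fpkm(rna_counts, lengths_bp)
    return {gene: (ribo[gene] + pseudo) / (rna[gene] + pseudo)
            for gene in ribo if gene in rna}


if __name__ == "__main__":
    # Made-up toy counts, purely to show the call shape.
    lengths = {"geneA": 1000, "geneB": 1000}
    ribo = {"geneA": 200, "geneB": 200}
    rna = {"geneA": 100, "geneB": 300}
    print(translational_efficiency(ribo, rna, lengths, pseudo=0.0))
```

In this toy input, geneA takes a larger share of the ribosome footprints than of the RNA-seq reads, so its TE is above 1, while geneB's is below 1.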
Source:
Tjeldnes, H., Labun, K., Torres Cleuren, Y. et al. ORFik: a comprehensive R toolkit for the analysis of translation. BMC Bioinformatics 22, 336 (2021). https://doi.org/10.1186/s12859-021-04254-w