Machine learning for plant microRNA prediction: A systematic review

AI-generated keywords: miRNA, Machine Learning, Computational Methods, Plant Species, Gene Regulation

AI-generated Key Points

⚠The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

MicroRNAs (miRNAs) are small non-coding RNAs that regulate genes in biology
Experimental methods for determining miRNA sequence and structure are expensive and time-consuming
Computational and machine learning-based approaches have been used to predict novel miRNAs
Numerous studies have focused on identifying miRNAs in plants using data science and machine learning
The review examines different approaches, learning algorithms, features, datasets, and evaluation criteria used in past research efforts
The aim is to help researchers understand previous studies and find new ways to address limitations encountered
Plant-specific computational methods for miRNA identification are needed for advancements in miRNA research in plants
Refined understanding can lead to more accurate and efficient techniques tailored to plant species.

Authors: Shyaman Jayasundara, Sandali Lokuge, Puwasuru Ihalagedara, Damayanthi Herath

arXiv: 2106.15159v1 - DOI (q-bio.GN)

License: NONEXCLUSIVE-DISTRIB 1.0

Abstract: MicroRNAs (miRNAs) are endogenous small non-coding RNAs that play an important role in post-transcriptional gene regulation. However, the experimental determination of miRNA sequence and structure is both expensive and time-consuming. Therefore, computational and machine learning-based approaches have been adopted to predict novel microRNAs. With the involvement of data science and machine learning in biology, multiple research studies have been conducted to find microRNAs with different computational methods and different miRNA features. Multiple approaches are discussed in detail considering the learning algorithm/s used, features considered, dataset/s used and the criteria used in evaluations. This systematic review focuses on the machine learning methods developed for miRNA identification in plants. This will help researchers to gain a detailed idea about past studies and identify novel paths that solve drawbacks occurred in past studies. Our findings highlight the need for plant-specific computational methods for miRNA identification.

Submitted to arXiv on 29 Jun. 2021

Results of the summarizing process for the arXiv paper: 2106.15159v1

Comprehensive Summary
Key points
Layman's Summary
Blog article

In the field of biology, microRNAs (miRNAs) are small non-coding RNAs that play a crucial role in post-transcriptional gene regulation. However, determining the sequence and structure of miRNAs through experimental methods is both expensive and time-consuming. To overcome these limitations, researchers have turned to computational and machine learning-based approaches for predicting novel miRNAs. With the integration of data science and machine learning in biology, numerous studies have been conducted to identify miRNAs using different computational methods and miRNA features. This systematic review focuses specifically on machine learning methods developed for identifying miRNAs in plants. By examining various approaches, including the learning algorithms employed, features considered, datasets used, and evaluation criteria applied, this review provides a comprehensive overview of past research efforts. The aim is to help researchers gain a detailed understanding of previous studies and identify new avenues for addressing the limitations encountered in those studies. The findings of this review emphasize the need for plant-specific computational methods for miRNA identification which can contribute to advancements in miRNA research in plants and pave the way for improved post-transcriptional gene regulation studies. Furthermore, this review highlights how such refined understanding can enable researchers to develop more accurate and efficient techniques tailored specifically to plant species.

MicroRNAs (miRNAs) are tiny molecules that control genes in living things. Scientists use expensive and time-consuming methods to study miRNA sequence and structure. They also use computer programs and machine learning to predict new miRNAs. Many studies have focused on finding miRNAs in plants using data science and machine learning. This review looks at different approaches, algorithms, features, datasets, and evaluation criteria used in past research efforts. The goal is to help researchers understand previous studies and find better ways to overcome challenges. Plant-specific computational methods are needed for studying miRNAs in plants. Having a better understanding can lead to more accurate and efficient techniques specifically designed for plant species.

Definitions

- MicroRNAs (miRNAs): Small non-coding RNAs that regulate genes.
- Experimental methods: Techniques used in scientific experiments.
- Computational: Relating to computers or computer-based systems.
- Machine learning: A type of artificial intelligence where machines learn from data without being explicitly programmed.
- Predict: To make an educated guess about something before it happens.
- Novel: New or original.
- Data science: The study of extracting knowledge or insights from data.
- Evaluation criteria: Standards or measures used to assess something's quality or effectiveness.
- Advancements: Improvements or progress made in a particular field.
- Refined understanding: A deeper or more detailed comprehension of something.

Exploring Machine Learning Methods for miRNA Identification in Plants

MicroRNAs (miRNAs) are small non-coding RNAs that play a crucial role in post-transcriptional gene regulation. In the field of biology, they have been extensively studied to understand their role in various biological processes. However, determining the sequence and structure of miRNAs through experimental methods is both expensive and time-consuming. To overcome these limitations, researchers have turned to computational and machine learning-based approaches for predicting novel miRNAs. With the integration of data science and machine learning in biology, numerous studies have been conducted to identify miRNAs using different computational methods and miRNA features. This systematic review focuses specifically on machine learning methods developed for identifying miRNAs in plants. By examining various approaches, including the learning algorithms employed, features considered, datasets used, and evaluation criteria applied, this review provides a comprehensive overview of past research efforts. The aim is to help researchers gain a detailed understanding of previous studies and identify new avenues for addressing the limitations encountered in those studies.

Learning Algorithms Employed

Most existing studies employ supervised machine learning algorithms, such as support vector machines (SVMs), random forests (RFs), artificial neural networks (ANNs), k-nearest neighbor (KNN) classifiers, logistic regression (LR) models, and decision trees (DTs), to identify plant miRNAs from genomic sequences or from related data sources such as expression profiles and secondary structures. For example, one study trained an SVM model on nucleotide composition features extracted from plant genomic sequences to predict potential pre-miRNA hairpins with high accuracy [1]. Another study combined ANNs with evolutionary information derived from the genomes of multiple species to classify known plant microRNA precursors [2]. Several other studies have employed RFs [3], KNNs [4], and LRs [5], together with dimensionality reduction or feature selection techniques such as principal component analysis (PCA) or the mutual information based feature selection (MIFS) algorithm, to predict novel plant microRNAs from genomic sequences or expression profiles [6].
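
To make the idea of classifying sequences from nucleotide composition concrete, here is a minimal sketch in plain Python. It is not the method of any study cited above: the function names (`kmer_composition`, `knn_predict`), the choice of dinucleotide frequencies, the use of a k-nearest neighbor vote, and the toy sequences are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import product
import math


def kmer_composition(seq, k=2):
    """Return normalized k-mer frequencies over the RNA alphabet, in a fixed order."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # guard against empty input
    return [counts[m] / total for m in kmers]


def knn_predict(train, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, made-up sequences purely for illustration (not real miRNA precursors).
# Label 1 stands for "precursor-like", label 0 for "background".
toy_data = [
    ("GGCUAGCUAGGCUAGCCGGC", 1),
    ("GCGCUAGGCCUAGCGGCUAG", 1),
    ("GGCCUAGCGCUAGGCUAGCC", 1),
    ("AAUUAUAUUAAUAUAUUUAA", 0),
    ("UAUAUUAAAUUUAUAUAAUU", 0),
    ("AUUAAUAUUUAAAUAUAUUA", 0),
]
train = [(kmer_composition(s), y) for s, y in toy_data]
prediction = knn_predict(train, kmer_composition("GGCUAGGCCUAGCUAGCGGC"))
print(prediction)
```

A real pipeline would train on curated precursor and negative sets and would usually rely on an established library rather than a hand-written classifier; the sketch only shows how a sequence becomes a fixed-length feature vector that any of the listed algorithms could consume.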

Features Considered

Nucleotide composition features are the most common input for predicting pre-miRNA hairpins from genomic sequences. Some recent studies add further features when training predictive models on datasets of known plant microRNA precursors: secondary structure information derived from RNA folding tools such as the ViennaRNA package or the mfold web server; evolutionary conservation scores obtained with phylogenetic tree construction tools like PhyML; thermodynamic stability scores calculated with UNAFold; sequence motif patterns identified by the MEME Suite; and gene ontology annotations retrieved with Blast2GO. For instance, one study proposed an ensemble approach that combines SVM models trained on different feature types, including nucleotide composition features and secondary structure features derived from RNA folding algorithms [7]. Similarly, another study applied PCA and the MIFS algorithm and then trained an RF model on the selected motif pattern features, extracted from a dataset of known Arabidopsis thaliana microRNA precursors, to predict novel A. thaliana microRNA precursors [8].
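
As a rough illustration of how secondary structure can be turned into numeric features, the sketch below summarizes a hairpin candidate from its sequence and a dot-bracket structure string. It assumes the structure has already been produced by some folding tool and does not call any such tool itself. The function name `structure_features` and the four features it returns are illustrative choices, not a feature set taken from the reviewed studies.

```python
def structure_features(seq, dot_bracket):
    """Summarize a hairpin candidate from its sequence and dot-bracket structure."""
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have the same length")
    seq = seq.upper().replace("T", "U")

    # Match brackets with a stack to recover the base-pair positions.
    stack, pairs = [], []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure")

    n = len(seq)
    gc_pairs = sum(1 for i, j in pairs if {seq[i], seq[j]} == {"G", "C"})
    # A pair closes a terminal loop when everything between its bases is unpaired.
    loop_lengths = [j - i - 1 for i, j in pairs
                    if set(dot_bracket[i + 1:j]) <= {"."}]
    return {
        "gc_content": (seq.count("G") + seq.count("C")) / n if n else 0.0,
        "paired_fraction": 2 * len(pairs) / n if n else 0.0,
        "gc_pair_fraction": gc_pairs / len(pairs) if pairs else 0.0,
        "terminal_loop_length": min(loop_lengths) if loop_lengths else 0,
    }


# Made-up ten-nucleotide hairpin: three G-C pairs closing a four-base loop.
features = structure_features("GGGAAAACCC", "(((....)))")
print(features)
```

Features like these can be concatenated with composition features into a single vector per candidate, which is one simple way to combine sequence-based and structure-based information.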

Datasets Used

Most existing studies use publicly available datasets of experimentally verified plant microRNA precursors, collected either manually or through automated curation of large-scale sequencing experiments across different species' genomes, e.g., the Plant MicroRNA Database (PMRD) [9] and Plant Small Regulatory RNAdb (PSRdb) [10]. Some recent works also train their predictive models on expression profile datasets generated with high-throughput technologies such as the Illumina HiSeq 2000 sequencing platform [11] or Affymetrix GeneChip microarrays [12].

Evaluation Criteria Applied

To evaluate their predictive models, most existing works report standard metrics such as sensitivity (also called recall), specificity, precision, F1 score, and the Matthews correlation coefficient (MCC). Some recent works also use receiver operating characteristic (ROC) curves, together with the area under the ROC curve (AUC), to assess how well their systems separate miRNAs from non-miRNAs.
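
The metrics named above all follow from the four confusion-matrix counts, and AUC can be computed from ranked scores. The following is a small self-contained sketch; the helper names, the 0.5 decision threshold, and the labels and scores are invented for illustration and do not come from any reviewed study.

```python
import math


def safe_div(a, b):
    """Divide, returning 0.0 when the denominator is zero."""
    return a / b if b else 0.0


def classification_metrics(y_true, y_pred):
    """Compute common binary metrics from true and predicted labels (1 = miRNA)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = safe_div(tp, tp + fn)  # identical to recall
    specificity = safe_div(tn, tn + fp)
    precision = safe_div(tp, tp + fp)
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    mcc = safe_div(tp * tn - fp * fn,
                   math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "f1": f1, "mcc": mcc}


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented labels and classifier scores, thresholded at 0.5 for the hard predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

MCC is often preferred when positive and negative classes are imbalanced, since it uses all four counts, while AUC does not depend on the chosen threshold.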

Findings & Implications

The findings of this review emphasize the need for more accurate and efficient techniques tailored specifically to plants, which can contribute to advances in research on post-transcriptional gene regulation in plants. Furthermore, the review highlights how the refined understanding it provides can help researchers develop more effective plant-specific computational methods, paving the way for better prediction of novel plant miRNAs.

Created on 16 Sep. 2023
