TY  - DATA
T1  - miR-BAG: Bagging Based Identification of MicroRNA Precursors
PY  - 2012/09/25
AU  - Ashwani Jha
AU  - Rohit Chauhan
AU  - Mrigaya Mehra
AU  - Heikham Russiachand Singh
AU  - Ravi Shankar
UR  - https://plos.figshare.com/articles/dataset/miR_BAG_Bagging_Based_Identification_of_MicroRNA_Precursors/119637
DO  - 10.1371/journal.pone.0045782
L4  - https://ndownloader.figshare.com/files/302380
L4  - https://ndownloader.figshare.com/files/302647
L4  - https://ndownloader.figshare.com/files/302745
L4  - https://ndownloader.figshare.com/files/302973
L4  - https://ndownloader.figshare.com/files/302995
L4  - https://ndownloader.figshare.com/files/303126
L4  - https://ndownloader.figshare.com/files/303152
L4  - https://ndownloader.figshare.com/files/303232
KW  - bagging
KW  - based
KW  - microrna
KW  - precursors
N2  - Non-coding elements such as miRNAs play key regulatory roles in living systems. These ultra-short, ∼21 bp long RNA molecules are derived from hairpin precursors and usually participate in negative gene regulation by binding their target mRNAs. Discovering miRNA candidate regions across the genome has been a challenging problem, and most existing tools work reliably only for limited datasets. Here, we present a novel, reliable approach, miR-BAG, developed to identify miRNA candidate regions in genomes by scanning sequences as well as by using next generation sequencing (NGS) data. miR-BAG uses a bootstrap aggregation based machine learning approach, creating an ensemble of complementary learners to attain high accuracy while balancing sensitivity and specificity. miR-BAG was developed for a wide range of species and tested extensively on a wide range of experimentally validated data. Considering position-specific variation of triplet structural profiles and mature miRNA anchored structural profiles had a positive impact on performance. miR-BAG’s performance was found to be consistent, and its accuracy was observed to be >90% for most of the species considered in the present study. In a detailed comparative analysis, miR-BAG performed better than six existing tools. Using the miR-BAG NGS module, we identified a total of 22 novel miRNA candidate regions in the cow genome, in addition to a total of 42 cow-specific miRNA regions. In practice, discovering miRNA regions in a genome demands high-throughput data analysis, which requires a large amount of processing. Considering this, miR-BAG has been developed with a multi-threaded parallel architecture, as a web server as well as a user-friendly standalone GUI version.
ER  - 