krnb_a_1387709_sm6568.xlsx (18.77 kB)

A comparative study of sequence- and structure-based features of small RNAs and other RNAs of bacteria

dataset
posted on 2017-11-03, 13:14 authored by Amita Barik, Santasabuj Das

Small RNAs (sRNAs) in bacteria have emerged as key players in transcriptional and post-transcriptional regulation of gene expression. Here, we present a statistical analysis of sequence- and structure-related features of bacterial sRNAs to identify descriptors that can discriminate sRNAs from other bacterial RNAs. We investigated a comprehensive and heterogeneous collection of 816 sRNAs, identified by northern blotting across 33 bacterial species, and compared their features with those of other classes of bacterial RNAs: tRNAs, rRNAs and mRNAs. We observed that sRNAs differed significantly from the other classes with respect to G+C composition, normalized minimum free energy of folding, motif frequency and several RNA-folding parameters, such as base-pairing propensity, Shannon entropy and base-pair distance. Based on the selected features, we developed a predictive model using the Random Forests (RF) method to classify these four classes of RNAs. Our model displayed an overall predictive accuracy of 89.5%. These findings should help to differentiate bacterial sRNAs from other RNAs and further promote the prediction of novel sRNAs in different bacterial species.
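
As an illustration of the sequence-level descriptors named above, the following minimal Python sketch computes G+C composition, a length-normalized minimum free energy, and k-mer (motif) frequencies for a single RNA sequence. It is not the authors' pipeline: the function names, the example sequence, the example MFE value and the choice to normalize MFE by dividing by sequence length are all assumptions made for illustration. The MFE itself is taken as an input and would have to come from an external folding program.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of G and C nucleotides in an RNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def normalized_mfe(mfe_kcal_per_mol: float, length: int) -> float:
    """Return the MFE divided by sequence length.

    Assumption: normalization is by length; the source does not
    spell out the exact normalization it uses.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return mfe_kcal_per_mol / length


def kmer_frequencies(seq: str, k: int = 3) -> dict:
    """Return the relative frequency of each overlapping k-mer in seq."""
    seq = seq.upper()
    total = len(seq) - k + 1
    if total <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}


# Hypothetical 8-nt sequence and hypothetical MFE value, for illustration only.
example_seq = "GGCAUUCG"
example_mfe = -2.4  # kcal/mol, not computed here

features = {
    "gc": gc_content(example_seq),
    "norm_mfe": normalized_mfe(example_mfe, len(example_seq)),
    "dimers": kmer_frequencies(example_seq, k=2),
}
```

A feature dictionary of this kind, computed per sequence and extended with folding-derived parameters, is the sort of input a Random Forest classifier could be trained on to separate the four RNA classes.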

Funding

This project was supported by the Indian Council of Medical Research [extramural project (IRIS ID: 2013–1551G)].
