Sequence variation, common tissue expression patterns and learning models: a genome-wide survey of vertebrate ribosomal proteins
Ribosomal genes produce the constituents of the ribosome, one of the most conserved subcellular structures of all cells, from bacteria to eukaryotes, including animals. Some protein-coding ribosomal genes are thought to vary in their roles across species, particularly in vertebrates, as reflected in the involvement of some in a number of genetic diseases. Based on extensive sequence comparisons and systematic curation, we establish a reference set of ribosomal proteins in eleven vertebrate species and quantify their levels of sequence conservation. Moreover, we correlate their coordinated gene expression patterns and assess the exceptional role of paralogs in tissue specificity. Our results suggest that ribosomal proteins exhibit a complex relationship between structure and function that broadly maintains a consistent expression landscape across tissues. Most of the variation arises from species idiosyncrasies and may be due to evolutionary change and adaptation, rather than to functional constraints at the tissue level throughout the vertebrate lineage.