Fig_2.tif (734.12 kB)

Correlations between mRNA and protein levels vary widely and are systematically reduced by experimental noise.

posted on 2015-05-07, 03:11 authored by Gábor Csárdi, Alexander Franks, David S. Choi, Edoardo M. Airoldi, D. Allan Drummond

A, Datasets vary widely in coverage of 5,887 yeast coding sequences and in resulting estimates of the mRNA–protein correlation. Shown are all pairwise correlations between 14 mRNA and 11 protein datasets, with within-study replicates averaged if present. Correlations are shown between mRNA and protein levels reported without correction (dots); using Spearman’s correction on pairs of datasets (binned, boxes show mean and bars indicate standard deviation); using Spearman’s correction on the largest set of paired measurements (red box); and as estimated by structured covariance modeling for 5,854 genes with a detected mRNA or protein (red diamond). B, Correlations obtained for the largest set of paired measurements, two of mRNA and two of protein levels (N = 3,418), computed individually, after averaging, and after correcting for noise using Spearman’s correction. C, Data are missing non-randomly. The distribution of protein levels, in molecules per cell, detected by western blotting [30] is shown, along with the subsets of these data corresponding to proteins detected by GFP-tagging and flow cytometry [39], LC-MS/MS [75], and 2D gel [6]. D, Distribution of protein levels, assessed by western blotting [30], for genes with at least one protein-level measurement (dark gray, number of genes N1 = 3,840) and for the subset of genes with at least 8 mRNA and 8 protein measurements (light gray, number of genes N8 = 549). E, mRNA–protein correlations between averaged mRNA and protein levels over subsets of at most 1, 2, 3, …, 8 measurements each of mRNA and protein levels drawn at random from the N8 set. Error bars show the standard deviation of correlations from 50 random samples of the indicated number of measurements.
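
The noise correction named in panels A and B can be illustrated with a minimal sketch. It assumes one common form of Spearman's correction for attenuation: the geometric mean of the four observed mRNA–protein correlations divided by the square root of the product of the two within-variable replicate correlations. This may differ in detail from the computation behind the figure. The function name `spearman_correction`, the simulated data, and the parameters `true_r` and `noise_sd` are hypothetical and chosen only for illustration; only the sample size (3,418) is taken from the caption.

```python
import numpy as np


def spearman_correction(x1, x2, y1, y2):
    """Estimate the noise-corrected correlation between x and y.

    x1, x2 are two noisy measurements of the same quantity (e.g. mRNA level);
    y1, y2 are two noisy measurements of another (e.g. protein level).
    Assumes all observed correlations are positive.
    """
    def r(a, b):
        return np.corrcoef(a, b)[0, 1]

    # Geometric mean of the four observed cross-correlations.
    cross = (r(x1, y1) * r(x1, y2) * r(x2, y1) * r(x2, y2)) ** 0.25
    # Reliability term: geometric mean of the two replicate correlations.
    reliability = np.sqrt(r(x1, x2) * r(y1, y2))
    return cross / reliability


# Simulated example (hypothetical values, not data from the study).
rng = np.random.default_rng(0)
n = 3418          # matches the paired-measurement count in panel B
true_r = 0.9      # assumed underlying correlation
noise_sd = 0.7    # assumed measurement noise, same for every dataset

latent_x = rng.standard_normal(n)
latent_y = true_r * latent_x + np.sqrt(1 - true_r**2) * rng.standard_normal(n)

x1 = latent_x + noise_sd * rng.standard_normal(n)
x2 = latent_x + noise_sd * rng.standard_normal(n)
y1 = latent_y + noise_sd * rng.standard_normal(n)
y2 = latent_y + noise_sd * rng.standard_normal(n)

observed = np.corrcoef(x1, y1)[0, 1]
averaged = np.corrcoef((x1 + x2) / 2, (y1 + y2) / 2)[0, 1]
corrected = spearman_correction(x1, x2, y1, y2)

print(f"single pair: {observed:.2f}")
print(f"after averaging: {averaged:.2f}")
print(f"Spearman-corrected: {corrected:.2f}")
```

Under these assumptions, the single-pair correlation falls well below the underlying value, averaging replicates recovers part of the gap, and the corrected estimate lands close to `true_r`, which mirrors the ordering of the three kinds of estimate described for panel B.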