System-Wide Analyses Have Underestimated Protein Abundances and Transcriptional Importance in Animals
Proteome-wide surveys in mammalian tissue culture cells suggest that the protein expressed at the median abundance is present at 8,000 to 16,000 molecules per cell. Comparisons of protein and mRNA abundances imply that differences in mRNA expression between genes explain only 10-40% of the differences in protein levels. We find, however, that the proteome-wide surveys have significantly underestimated protein abundances. Using previously published individual measurements for 61 housekeeping proteins to rescale whole-proteome data from Schwanhausser et al., we find that the median protein detected is expressed at 170,000 molecules per cell. We further find that our corrected protein abundance estimates show a higher correlation and a stronger linear relationship with mRNA abundances than do the uncorrected protein data. To estimate the degree to which mRNA expression levels determine protein levels, it is critical to determine the experimental errors in protein and mRNA abundance data and to consider all genes, not only those whose protein expression is readily detected. We estimate the measurement errors in the data from Schwanhausser et al. and show that, when these errors are taken into account, mRNA levels explain at least 56% of the differences in protein abundance between the 4,212 genes detected. We also model protein expression levels in a cell for all genes and demonstrate using these modeled data that mRNA levels explain 92% of protein expression. As a result, we predict that translation rates vary much less between genes than implied by many studies. We show that this conclusion is supported by independent measurements of translation rates in tissue culture cells by Ingolia et al.
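The reason measurement error matters for the variance explained can be illustrated with the standard correction for attenuation, in which the estimated error variance is subtracted from each observed variance before the squared correlation is computed. The sketch below is an illustration under stated assumptions and is not necessarily the procedure used in this study: the data are simulated, the error magnitudes are arbitrary, errors are assumed independent of each other and of the true values, and the function names are hypothetical.

```python
import random
import statistics


def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def variance_explained(x, y, err_var_x=0.0, err_var_y=0.0):
    """Squared correlation between x and y.

    With the default zero error variances this is the ordinary squared
    Pearson correlation. With non-zero error variances, the known
    measurement-error variance is removed from each observed variance
    first (correction for attenuation).
    """
    signal_var_x = statistics.variance(x) - err_var_x
    signal_var_y = statistics.variance(y) - err_var_y
    if signal_var_x <= 0 or signal_var_y <= 0:
        raise ValueError("error variance exceeds observed variance")
    return covariance(x, y) ** 2 / (signal_var_x * signal_var_y)


# Simulated log-scale abundances (assumed values, not data from the study).
rng = random.Random(0)
N_GENES = 4212
SD_MRNA_ERR = 0.5      # assumed mRNA measurement error (standard deviation)
SD_PROTEIN_ERR = 0.7   # assumed protein measurement error (standard deviation)

true_mrna = [rng.gauss(0.0, 1.0) for _ in range(N_GENES)]
# True protein level tracks mRNA plus modest gene-specific variation.
true_protein = [m + rng.gauss(0.0, 0.3) for m in true_mrna]

# Observed values are the true values plus independent measurement error.
obs_mrna = [m + rng.gauss(0.0, SD_MRNA_ERR) for m in true_mrna]
obs_protein = [p + rng.gauss(0.0, SD_PROTEIN_ERR) for p in true_protein]

true_r2 = variance_explained(true_mrna, true_protein)
observed = variance_explained(obs_mrna, obs_protein)
corrected = variance_explained(
    obs_mrna, obs_protein, SD_MRNA_ERR ** 2, SD_PROTEIN_ERR ** 2
)

print(f"true R^2:      {true_r2:.2f}")
print(f"observed R^2:  {observed:.2f}")
print(f"corrected R^2: {corrected:.2f}")
```

In this simulation the uncorrected value understates how much of the variation in protein level is attributable to mRNA level, and removing the error variances moves the estimate back toward the value computed from the error-free simulated data.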