Enabling proteomic studies with RNA-Seq: The proteome of tomato pollen as a test case
Effective proteome profiling is generally considered to depend heavily on the availability of a high-quality DNA reference database. As such, proteomics has long been taxonomically restricted, with limited inroads being made into the proteomes of “non-model” organisms. However, next-generation sequencing (NGS), and particularly RNA-Seq, now allows deep-coverage detection of expressed genes at low cost, which in turn may facilitate the matching of peptide mass spectra with cognate gene sequences. To test this, we performed a quantitative analysis of the proteomes of pollen from domesticated tomato (Solanum lycopersicum) and two wild relatives that exhibit differences in mating systems and in interspecific reproductive barriers. Using a custom tomato RNA-Seq database created through 454 pyrosequencing, more than 1200 proteins were identified, with subsets showing expression differences between genotypes or in the accumulation of the corresponding transcripts. Importantly, no major qualitative or quantitative differences were observed in the characterized proteomes when mass spectra were used to interrogate either a highly curated community database of tomato sequences generated through traditional sequencing technologies, or the RNA-Seq database. We conclude that RNA-Seq provides a cost-effective and robust platform for protein identification and will be increasingly valuable to the field of proteomics.