SNP Discovery with EST and NextGen Sequencing in Switchgrass (Panicum virgatum L.).
Although yield trials for switchgrass (Panicum virgatum L.), a potentially high value biofuel feedstock crop, are currently underway throughout North America, the genetic tools for crop improvement in this species are still in the early stages of development. Identification of high-density molecular markers, such as single nucleotide polymorphisms (SNPs), that are amenable to high-throughput genotyping approaches, is the first step in a quantitative genetics study of this model biofuel crop species. We generated and sequenced expressed sequence tag (EST) libraries from thirteen diverse switchgrass cultivars representing both upland and lowland ecotypes, as well as tetraploid and octoploid genomes. We followed this with reduced genomic library preparation and massively parallel sequencing of the same samples using the Illumina Genome Analyzer technology platform. EST libraries were used to generate unigene clusters and establish a gene-space reference sequence, thus providing a framework for assembly of the short sequence reads. SNPs were identified utilizing these scaffolds. We used a custom software program for alignment and SNP detection and identified over 149,000 SNPs across the 13 short-read sequencing libraries (SRSLs). Approximately 25,000 additional SNPs were identified from the entire EST collection available for the species. This sequencing effort generated data that are suitable for marker development and for estimation of population genetic parameters, such as nucleotide diversity and linkage disequilibrium. Based on these data, we assessed the feasibility of genome wide association mapping and genomic selection applications in switchgrass. Overall, the SNP markers discovered in this study will help facilitate quantitative genetics experiments and greatly enhance breeding efforts that target improvement of key biofuel traits and development of new switchgrass cultivars.