Single nucleotide polymorphism discovery from wheat next-generation sequence data
Single nucleotide polymorphisms (SNPs) are the most abundant type of molecular genetic marker and can be used for producing high-resolution genetic maps, marker-trait association studies and marker-assisted breeding. Large polyploid genomes such as wheat present a challenge for SNP discovery because of the potential presence of multiple homoeologs for each gene. AutoSNPdb has been successfully applied to identify SNPs from Sanger sequence data for several species, including barley, rice and Brassica, but the volume of data required to accurately call SNPs in the complex genome of wheat has prevented its application to this important crop. DNA sequencing technology has been revolutionized by the introduction of next-generation sequencing, and it is now possible to generate several million sequence reads in a timely and cost-effective manner. We have produced wheat transcriptome sequence data using 454 sequencing technology and applied this for SNP discovery using a modified autoSNPdb method, which integrates SNP and gene annotation information with a graphical viewer. A total of 4 694 141 sequence reads from three bread wheat varieties were assembled to identify a total of 38 928 candidate SNPs. Each SNP is within an assembly complete with annotation, enabling the selection of polymorphism within genes of interest.