The Composition and Origins of Genomic Variation among Individuals of the Soybean Reference Cultivar Williams 82
Soybean (Glycine max) is a self-pollinating species that has relatively low nucleotide polymorphism rates compared with other crop species. Despite the low rate of nucleotide polymorphisms, a wide range of heritable phenotypic variation exists. There is even evidence for heritable phenotypic variation among individuals within some cultivars. Williams 82, the soybean cultivar used to produce the reference genome sequence, was derived from backcrossing a Phytophthora root rot resistance locus from the donor parent Kingwa into the recurrent parent Williams. To explore the genetic basis of intracultivar variation, we investigated the nucleotide, structural, and gene content variation of different Williams 82 individuals. Williams 82 individuals exhibited variation in the number and size of introgressed Kingwa loci. In these regions of genomic heterogeneity, the reference Williams 82 genome sequence consists of a mosaic of Williams and Kingwa haplotypes. Genomic structural variation between Williams and Kingwa was maintained between the Williams 82 individuals within the regions of heterogeneity. Additionally, the regions of heterogeneity exhibited gene content differences between Williams 82 individuals. These findings show that genetic heterogeneity in Williams 82 primarily originated from the differential segregation of polymorphic chromosomal regions following the backcross and single-seed descent generations of the breeding process. We conclude that soybean haplotypes can possess a high rate of structural and gene content variation, and the impact of intracultivar genetic heterogeneity may be significant. This detailed characterization will be useful for interpreting soybean genomic data sets and highlights important considerations for research communities that are developing or utilizing a reference genome sequence.