Evaluating pathogenicity of rare variants from dilated cardiomyopathy in the exome era.
Human exome sequencing is a recently developed tool that aids the discovery of novel coding variants. Now broadly applied, exome sequencing data sets provide a new opportunity to evaluate the allele frequencies of rare variants previously published as pathogenic. We compared the exome data set from the National Heart, Lung, and Blood Institute Exome Sequencing Project with a catalog of 197 previously published rare variants reported as causative of dilated cardiomyopathy (DCM) in familial and sporadic cases. Of these 197 variants, 33 (16.8%) were also present in the Exome Sequencing Project database, raising the question of whether they are uncommon polymorphisms. Supporting functional data have been published for 14 of the 33 (42%), suggesting that these 14 are unlikely to be false positives. The frequencies of these functionally supported variants in the Exome Sequencing Project data set ranged from 0.02% to 1.33% (median 0.04%); applying the median as a cutoff to filter variants in a DCM pedigree identified an additional DCM candidate gene. A greater proportion of sporadic DCM cases had variants that were present in the Exome Sequencing Project data set versus novel variants (ie, not in the Exome Sequencing Project; 44% versus 21%; P=0.002), suggesting that some of the variants identified as disease causing in sporadic DCM are either false positives or low-penetrance alleles in human populations. Rare nonsynonymous variants identified in DCM subjects that are also present at very low frequencies in public databases are likely relevant for DCM. Variants with allele frequencies >0.04% are of less certain pathogenicity, especially if identified in sporadic cases, although this cutoff should be viewed as preliminary.
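The frequency-based filtering described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Variant` record, the `classify` function, the category labels, and the placeholder gene names are all hypothetical, and the only values taken from the text are the preliminary 0.04% cutoff and the reported counts (33 of 197; 14 of 33).

```python
from dataclasses import dataclass
from typing import Optional

# Preliminary cutoff from the text: 0.04% allele frequency, as a fraction.
MAF_CUTOFF = 0.0004


@dataclass
class Variant:
    gene: str                   # placeholder gene label (hypothetical)
    change: str                 # placeholder variant label (hypothetical)
    ref_af: Optional[float]     # allele frequency in the reference exome
                                # data set; None if the variant is absent


def classify(variant: Variant, cutoff: float = MAF_CUTOFF) -> str:
    """Bin a candidate variant by its reference allele frequency.

    The three labels are illustrative assumptions, not terms from the source.
    """
    if variant.ref_af is None:
        return "novel"              # not seen in the reference data set
    if variant.ref_af <= cutoff:
        return "rare"               # present, at or below the cutoff
    return "less_certain"           # above the cutoff: pathogenicity less certain


def percent(part: int, whole: int, digits: int = 1) -> float:
    """Percentage of `part` in `whole`, rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)


# Hypothetical candidates from a pedigree; frequencies are made up for illustration.
candidates = [
    Variant("GENE_A", "variant_1", None),
    Variant("GENE_B", "variant_2", 0.0002),
    Variant("GENE_C", "variant_3", 0.0133),
]

retained = [v for v in candidates if classify(v) != "less_certain"]

# Proportions reported in the text, recomputed from the stated counts.
overlap_pct = percent(33, 197)        # variants also present in the reference set
functional_pct = percent(14, 33, 0)   # overlapping variants with functional data
```

Here `retained` keeps the novel and low-frequency candidates and drops the one above the cutoff. Treating variants exactly at 0.04% as retained is a design choice of this sketch, following the text's statement that frequencies greater than 0.04% are of less certain pathogenicity.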