Pseudo semiparametric maximum likelihood estimation exploiting gene–environment independence for population-based case–control studies with complex samples
Advances in human genetics have led to epidemiological investigations not only of the effects of genes alone but also of gene–environment (G–E) interaction. A widely accepted design for studying how G–E interaction relates to disease risk is the population-based case–control study (PBCCS). For simple random samples, semiparametric methods for testing G–E interaction were developed by Chatterjee and Carroll in 2005. Complex sampling in PBCCS, which involves differential selection probabilities for cases and controls and possibly cluster sampling, is becoming more common. Complex sampling induces two complications: the need to weight for selection probabilities, and intracluster correlation of observations. We develop pseudo-semiparametric maximum likelihood estimators (pseudo-SPMLE) that apply to PBCCS with complex sampling. We study the finite-sample performance of the pseudo-SPMLE using simulations and illustrate it with a US case–control study of kidney cancer.