Outcome Vector Dependent Sampling with Longitudinal Continuous Response Data: Stratified Sampling Based on Summary Statistics
Summary: The analysis of longitudinal trajectories usually focuses on evaluating explanatory factors that are associated either with rates of change or with overall mean levels of a continuous outcome variable. In this article, we introduce valid design and analysis methods that permit outcome-dependent sampling of longitudinal data for scenarios where all outcome data already exist, but a targeted substudy is being planned to collect additional key exposure information on a limited number of subjects. We propose stratified sampling based on specific summaries of individual longitudinal trajectories, and we detail an ascertainment-corrected maximum likelihood approach for estimation using the resulting biased sample of subjects. In addition, we demonstrate that the efficiency of an outcome-based sampling design relative to a simple random sample depends strongly on the choice of outcome summary statistic used to direct sampling, and we show a natural link between the goals of the longitudinal regression model and the corresponding desirable designs. Using data from the Childhood Asthma Management Program, where genetic information required retrospective ascertainment, we study a range of designs that examine lung function profiles over 4 years of follow-up for children classified according to their genotype for the IL 13 cytokine.
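The sampling step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the trajectory summaries are each subject's ordinary least squares intercept and slope over time, that strata are defined by two cutoffs on one summary, and that subjects are sampled independently with a stratum-specific probability. The function names, cutoffs, and probabilities are hypothetical and are not taken from the article.

```python
import numpy as np


def subject_summaries(times, y):
    """Per-subject OLS intercept and slope of y on time.

    times, y: arrays of shape (n_subjects, n_visits).
    Returns two arrays of length n_subjects.
    """
    t_bar = times.mean(axis=1, keepdims=True)
    y_bar = y.mean(axis=1, keepdims=True)
    sxx = ((times - t_bar) ** 2).sum(axis=1)
    sxy = ((times - t_bar) * (y - y_bar)).sum(axis=1)
    slope = sxy / sxx
    intercept = y_bar[:, 0] - slope * t_bar[:, 0]
    return intercept, slope


def stratify(q, low_cut, high_cut):
    """Assign each summary value to a stratum: 0 = low, 1 = middle, 2 = high."""
    return np.digitize(q, [low_cut, high_cut])


def outcome_dependent_sample(q, cutoffs, probs, rng):
    """Sample subjects with a probability that depends on their stratum.

    q: summary statistic per subject (e.g. the slope).
    cutoffs: (low_cut, high_cut), assumed values chosen by the designer.
    probs: sampling probability for the low, middle, and high strata.
    Returns the strata, each subject's sampling probability, and a
    boolean indicator of who was selected.
    """
    strata = stratify(q, *cutoffs)
    pi = np.asarray(probs)[strata]
    sampled = rng.random(q.shape[0]) < pi
    return strata, pi, sampled


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 500, 5
    times = np.tile(np.arange(m, dtype=float), (n, 1))
    # Simulated trajectories: random intercepts and slopes plus noise.
    b0 = rng.normal(0.0, 1.0, size=(n, 1))
    b1 = rng.normal(0.0, 0.5, size=(n, 1))
    y = b0 + b1 * times + rng.normal(0.0, 0.3, size=(n, m))

    _, slope = subject_summaries(times, y)
    cutoffs = tuple(np.quantile(slope, [0.1, 0.9]))
    # Oversample the tails of the slope distribution.
    strata, pi, sampled = outcome_dependent_sample(
        slope, cutoffs, probs=(1.0, 0.1, 1.0), rng=rng
    )
    print("sampled per stratum:", np.bincount(strata[sampled], minlength=3))
```

Sampling on the slope targets designs aimed at rates of change, while sampling on the intercept (or a mean level) would target overall level. In this sketch the strata and sampling probabilities are returned alongside the selection indicator because any ascertainment correction at the analysis stage would need to know how each subject came to be sampled.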