Modeling Taxa-Abundance Distributions in Microbial Communities using Environmental Sequence Data
We show that inferring the taxa-abundance distribution of a microbial community from small environmental samples alone is difficult. The difficulty stems from the disparity in scale between the number of genetic sequences that can be characterized and the number of individuals in the communities that microbial ecologists aspire to describe. One solution is to calibrate and validate a mathematical model of microbial community assembly using the small samples, and then to use the model to extrapolate to the taxa-abundance distribution of the whole population that is deemed to constitute a community. We demonstrate this approach using a simple neutral community assembly model in which random immigrations, births, and deaths determine the relative abundance of taxa in a community. In doing so, we further develop neutral theory to produce a taxa-abundance distribution for communities as large as those typical of microbial systems. In addition, we highlight that sampling uncertainties conspire to make the immigration rate calibrated from small samples much higher than the true immigration rate. This scale dependence of model parameters is not unique to neutral theories; it is a generic problem in ecology that is particularly acute in microbial ecology. We argue that, to overcome this problem and allow microbial ecologists to characterize large microbial communities from small samples, mathematical models that encapsulate sampling effects are required.
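As a rough illustration of the kind of neutral assembly process described above, the following Python sketch simulates a local community in which each death is replaced either by an immigrant from a source community or by the offspring of a randomly chosen local individual. It is a minimal sketch, not the calibrated model of this work: the function names, the fixed source-community abundances, and all parameter values are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def simulate_neutral_community(n_individuals, source_abundances, m, n_steps, seed=0):
    """Simulate a simple neutral community (illustrative assumptions only).

    n_individuals     -- fixed size of the local community (must be >= 2)
    source_abundances -- relative abundances of taxa in an assumed source community
    m                 -- probability that a death is replaced by an immigrant
    n_steps           -- number of death/replacement events to simulate
    """
    rng = random.Random(seed)
    taxa = list(range(len(source_abundances)))
    # Start from a community drawn at random from the source community.
    community = rng.choices(taxa, weights=source_abundances, k=n_individuals)
    for _ in range(n_steps):
        dead = rng.randrange(n_individuals)  # a randomly chosen individual dies
        if rng.random() < m:
            # Immigration: the replacement is drawn from the source community.
            community[dead] = rng.choices(taxa, weights=source_abundances, k=1)[0]
        else:
            # Birth: the replacement copies one of the other local individuals.
            parent = rng.randrange(n_individuals - 1)
            if parent >= dead:
                parent += 1
            community[dead] = community[parent]
    return community


def taxa_abundance(individuals):
    """Return the abundances of the observed taxa, sorted from most to least abundant."""
    return sorted(Counter(individuals).values(), reverse=True)


def small_sample(community, sample_size, seed=0):
    """Draw a small random sample of individuals without replacement."""
    return random.Random(seed).sample(community, sample_size)


if __name__ == "__main__":
    # Hypothetical source community: 50 taxa with geometrically declining abundance.
    source = [0.9 ** i for i in range(50)]
    community = simulate_neutral_community(1000, source, m=0.05, n_steps=20000, seed=1)
    sample = small_sample(community, 50, seed=1)
    print("taxa in community:", len(taxa_abundance(community)))
    print("taxa in sample:   ", len(taxa_abundance(sample)))
```

Comparing the taxa-abundance distribution of the full simulated community with that of the small sample gives a feel for how much information is lost when only a small number of individuals is observed. Calibrating the immigration probability from such a sample is beyond the scope of this sketch.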