Estimating species number under an inconvenient abundance model
Estimating the number of species in a biological community from a multinomial sample of individual organisms is a classical problem in statistical ecology. A central issue in parametric estimation is the specification of a model for the relative abundances of species given their number. A common approach is to assume that relative abundances follow a symmetric Dirichlet distribution. This is mathematically convenient but is unconnected to work by ecologists on abundance distributions in real communities. In this article we describe maximum likelihood (ML) estimation based on the sequential broken stick model that has been proposed for abundances. This model is defined mechanistically, so the likelihood must be approximated numerically. For this to be feasible, the likelihood must be based on a small number of summary statistics. We present simulation results showing that the observed number of species and the observed number of species represented by a single individual form a reasonable set of summary statistics on which to base estimation. We apply the method to two published data sets, one involving insect species on Mount Kenya and the other involving spider species in an Appalachian forest.
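The simulation-based approach can be illustrated with a minimal Python sketch. It is not the procedure used in this article; it rests on assumptions that the abstract does not specify. In particular, the breaking rule shown here (pick an existing piece uniformly at random and break it at a uniform point) is only one variant of the sequential broken stick model, and the exact-match frequency is a naive stand-in for the numerical likelihood approximation. All function names are hypothetical.

```python
import random
from collections import Counter


def broken_stick_abundances(num_species, rng):
    """Sequentially break a unit stick into num_species pieces.

    Assumption: each step picks one existing piece uniformly at random
    and breaks it at a uniformly distributed point. Other variants of
    the model use different selection or breaking rules.
    """
    pieces = [1.0]
    while len(pieces) < num_species:
        piece = pieces.pop(rng.randrange(len(pieces)))
        cut = rng.random()
        pieces.extend([piece * cut, piece * (1.0 - cut)])
    return pieces


def summary_statistics(abundances, sample_size, rng):
    """Draw a multinomial sample and return (observed species, singletons)."""
    draws = rng.choices(range(len(abundances)), weights=abundances, k=sample_size)
    counts = Counter(draws)
    observed = len(counts)
    singletons = sum(1 for c in counts.values() if c == 1)
    return observed, singletons


def approx_likelihood(num_species, sample_size, target, num_sims, rng):
    """Naive Monte Carlo estimate of P(summary statistics == target | num_species).

    Assumption: the likelihood is approximated by the fraction of simulated
    samples whose two summary statistics match the observed pair exactly.
    """
    hits = 0
    for _ in range(num_sims):
        abundances = broken_stick_abundances(num_species, rng)
        if summary_statistics(abundances, sample_size, rng) == target:
            hits += 1
    return hits / num_sims
```

Under these assumptions, an estimate of species number would be obtained by evaluating `approx_likelihood` over a grid of candidate values of `num_species` and taking the value with the largest approximate likelihood.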