CiteULike is a free online bibliography manager. Register and you can start organising your references online.

Comparing discriminant analysis, neural networks and logistic regression for predicting species distributions: a case study with a Himalayan river bird Export

Ecological Modelling, Vol. 120, No. 2-3. (17 August 1999), pp. 337-347.

Citation Format

[Posts]

View FullText article


dmajka's tags for this article

discriminant-analysis distribution-modeling glm neural-networks stats

X Reviews [Write a review of this article]

X Find related articles from these CiteULike users

X Find related articles with these CiteULike tags

X Posting History

X Abstract

We assessed the occurrence of a common river bird, the Plumbeous Redstart Rhyacornis fuliginosus, along 180 independent streams in the Indian and Nepali Himalaya. We then compared the performance of multiple discrimant analysis (MDA), logistic regression (LR) and artificial neural networks (ANN) in predicting this species' presence or absence from 32 variables describing stream altitude, slope, habitat structure, chemistry and invertebrate abundance. Using the entire data (=training set) and a threshold for accepting presence in ANN and LR set to P>=0.5, ANN correctly classified marginally more cases (88%) than either LR (83%) or MDA (84%). Model performance was assessed from two methods of data partitioning. In a [`]leave-one-out' approach, LR correctly predicted more cases (82%) than MDA (73%) or ANN (69%). However, in a holdout procedure, all the methods performed similarly (73-75%). All methods predicted true absence (i.e. specificity in holdout: 81-85%) better than true presence (i.e. sensitivity: 57-60%). These effects reflect species' prevalence (=frequency of occurrence), but are seldom considered in distribution modelling. Despite occurring at only 36% of the sites, Plumbeous Redstarts are one of the most common Himalayan river birds, and problems will be greater with less common species. Both LR and ANN require an arbitrary threshold probability (often P=0.5) at which to accept species presence from model prediction. Simulations involving varied prevalence revealed that LR was particularly sensitive to threshold effects. ROC plots (received operating characteristic) were therefore used to compare model performance on test data at a range of thresholds; LR always outperformed ANN. This case study supports the need to test species' distribution models with independent data, and to use a range of criteria in assessing model performance. ANN do not yet have major advantages over conventional multivariate methods for assessing bird distributions. LR and MDA were both more efficient in the use of computer time than ANN, and also more straightforward in providing testable hypotheses about environmental effects on occurrence. However, LR was apparently subject to chance significant effects from explanatory variables, emphasising the well-known risks of models based purely on correlative data.


X BibTeX record

X RIS record


Privacy Statement | Terms & Conditions
CiteULike organises scholarly (or academic) papers or literature and provides bibliographic (which means it makes bibliographies) for universities and higher education establishments. It helps undergraduates and postgraduates. People studying for PhDs or in postdoctoral (postdoc) positions. The service is similar in scope to EndNote or RefWorks or any other reference manager like BibTeX, but it is a social bookmarking service for scientists and humanities researchers.