A joint investigation of misclassification treatments and imbalanced datasets on neural network performance
Two important factors that affect a classification model’s performance are imbalanced data and unequal misclassification costs. Both are especially important for neural network models developed to estimate the posterior probabilities of group membership used in classification decisions. This paper examines the effects of asymmetric misclassification costs and unbalanced group sizes on neural network classification performance, using an artificial data approach that can generate more complex datasets than those used in prior studies and that yields new insights into the problem. A different performance measure, one that directly measures the consistency of classification performance with the Bayes decision rule, is used. The results show that both asymmetric misclassification costs and imbalanced group sizes have significant effects on neural network classification performance, both independently and through their interaction. These effects are not always intuitive; they supplement prior findings and raise issues for future research.
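As a point of reference for the Bayes decision rule mentioned above, the following is a minimal sketch of the standard two-group, cost-sensitive form of the rule applied to an estimated posterior probability. It is an illustration only and is not the performance measure or the experimental procedure used in the paper; the function names `bayes_decision` and `decision_threshold` and the parameters `p1`, `cost_miss_1`, and `cost_miss_0` are hypothetical.

```python
def bayes_decision(p1, cost_miss_1=1.0, cost_miss_0=1.0):
    """Return the group (0 or 1) with the lower expected misclassification cost.

    p1          -- estimated posterior probability of group 1 (assumed input,
                   e.g. a neural network output)
    cost_miss_1 -- cost of assigning a true group-1 case to group 0
    cost_miss_0 -- cost of assigning a true group-0 case to group 1
    """
    if not 0.0 <= p1 <= 1.0:
        raise ValueError("p1 must be a probability in [0, 1]")
    # Assigning to group 0 is an error only when the true group is 1.
    expected_cost_assign_0 = p1 * cost_miss_1
    # Assigning to group 1 is an error only when the true group is 0.
    expected_cost_assign_1 = (1.0 - p1) * cost_miss_0
    # Ties are resolved in favor of group 0 (an arbitrary choice).
    return 1 if expected_cost_assign_1 < expected_cost_assign_0 else 0


def decision_threshold(cost_miss_1=1.0, cost_miss_0=1.0):
    """Posterior probability of group 1 above which group 1 is chosen."""
    return cost_miss_0 / (cost_miss_0 + cost_miss_1)


if __name__ == "__main__":
    # Equal costs: the cutoff is 0.5, so a posterior of 0.3 goes to group 0.
    print(bayes_decision(0.3))
    # Missing a group-1 case is five times as costly: the cutoff drops to
    # 1/6, so the same posterior of 0.3 is now assigned to group 1.
    print(bayes_decision(0.3, cost_miss_1=5.0, cost_miss_0=1.0))
```

Under these assumptions, asymmetric costs shift the cutoff away from 0.5, so the same posterior estimate can lead to different classification decisions depending on the cost ratio.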