This paper examines several issues in the experimental design and empirical testing of classification models. As an illustration, we focus on the classification of commercial bank loans. We stress the importance of, and the interaction among, three elements: the loss function associated with classification errors, the algorithm used to discriminate among or predict classifications, and the method used to estimate the expected misclassification losses achieved by various algorithms.