
Dataset complexity can help to generate accurate ensembles of k-nearest neighbors

Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on (26 September 2008), pp. 450-457.



jjrodriguez's tags for this article

bioinformatics dataset_complexity ensemble nearest_neighbor


Abstract

Gene expression based cancer classification using classifier ensembles is the main focus of this work. A new ensemble method is proposed that combines the predictions of a small number of k-nearest neighbor (k-NN) classifiers by majority vote. Diversity of predictions is guaranteed by assigning each classifier a separate feature subset, randomly sampled from the original set of features. Accuracy of the k-NNs is ensured by the statistically confirmed dependence between classification error and dataset complexity, which determines how difficult a dataset is to classify. Experiments carried out on three gene expression datasets containing different types of cancer show that our ensemble method is superior to 1) the single best classifier in the ensemble, 2) the nearest shrunken centroids method originally proposed for gene expression data, and 3) the traditional ensemble construction scheme that does not take dataset complexity into account.
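
The two ingredients the abstract names explicitly, random feature subsets and majority voting over k-NN classifiers, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy data, and the parameter values are all assumptions made for illustration, and the complexity-based selection of feature subsets is omitted because the abstract does not say which complexity measure is used.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k, features):
    """Classify x by k-NN, measuring distance only on the given feature indices."""
    dists = []
    for row, label in zip(train_X, train_y):
        # Squared Euclidean distance restricted to the feature subset.
        d = sum((row[f] - x[f]) ** 2 for f in features)
        dists.append((d, label))
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def build_subsets(n_features, n_classifiers, subset_size, rng):
    """Draw one random feature subset per ensemble member (hypothetical helper)."""
    return [rng.sample(range(n_features), subset_size) for _ in range(n_classifiers)]


def ensemble_predict(train_X, train_y, x, k, subsets):
    """Combine the members' k-NN predictions by majority vote."""
    member_votes = [knn_predict(train_X, train_y, x, k, s) for s in subsets]
    return Counter(member_votes).most_common(1)[0][0]


# Toy data (assumed): 6 features, two well-separated classes.
train_X = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1],
    [10, 10, 10, 10, 10, 10],
    [9, 9, 9, 9, 9, 9],
    [10, 9, 10, 9, 10, 9],
]
train_y = [0, 0, 0, 1, 1, 1]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
subsets = build_subsets(n_features=6, n_classifiers=5, subset_size=3, rng=rng)

print(ensemble_predict(train_X, train_y, [0.5] * 6, k=3, subsets=subsets))
print(ensemble_predict(train_X, train_y, [9.5] * 6, k=3, subsets=subsets))
```

In the method the abstract describes, the candidate subsets would additionally be screened by a dataset complexity measure before voting; in this sketch every randomly drawn subset is used as-is.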

