![]() |
CiteULike | ![]() |
ohkura's CiteULike | ![]() |
![]() |
|
![]() |
Register | ![]() |
Log in | ![]() |
A comparative study on feature selection in text categorizationedited by: Douglas H. FisherIn Proceedings of ICML-97, 14th International Conference on Machine Learning (1997), pp. 412-420.
|
Reviews
[Write a review of this article]
Find related articles from these CiteULike users
Find related articles with these CiteULike tags
Posting History
AbstractThis paper is a comparative study of feature selection methods in statistical learning of text categorization. The focus is on aggressive dimensionality reduction. Five methods were evaluated, including term selection based on document frequency (DF), information gain (IG), mutual information (MI), a Ø 2 -test (CHI), and term strength (TS). We found IG and CHI most effective in our experiments. Using IG thresholding with a knearest neighbor classifier on the Reuters corpus, removal of up to...
BibTeX record
RIS record