
Learning Rules that Classify E-Mail

In Papers from the AAAI Spring Symposium on Machine Learning in Information Access (1996), pp. 18-25.





Abstract

Two methods for learning text classifiers are compared on classification problems that might arise in filtering and filing personal e-mail messages: a "traditional IR" method based on TF-IDF weighting, and a new method for learning sets of "keyword-spotting rules" based on the RIPPER rule learning algorithm. It is demonstrated that both methods obtain significant generalizations from a small number of examples; that both methods are comparable in generalization performance on problems of this type; and that both methods are reasonably efficient, even with fairly large training sets. However, the greater comprehensibility of the rules may be advantageous in a system that allows users to extend or otherwise modify a learned classifier.

Introduction

Perhaps the most-discussed technical phenomenon of recent years has been the rapid growth of the Internet, or more generally, the rapid growth in the number of on-line documents. This has led to increased interest in intelligent methods for ...
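
The two classifier styles contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the TF-IDF variant shown (raw term count times log inverse document frequency) is one common textbook form, and the rules are written by hand rather than learned by RIPPER. The function names `tf_idf_vectors`, `rule_matches`, and `classify_with_rules`, and the representation of a rule as a set of required words, are assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Weight each term by count * log(N / document frequency).

    `docs` is a list of token lists. This is one common TF-IDF form,
    assumed here; the paper's exact weighting may differ.
    """
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors


def rule_matches(rule, tokens):
    """A hypothetical keyword-spotting rule: every word in `rule` must appear."""
    return rule <= set(tokens)


def classify_with_rules(rules, tokens):
    """Label a message positive if any rule in the set fires."""
    return any(rule_matches(rule, tokens) for rule in rules)


# Toy messages, already tokenized (hypothetical data).
messages = [["meeting", "agenda"], ["meeting", "lunch"]]
vectors = tf_idf_vectors(messages)

# Hand-written rule set standing in for a learned one.
talk_rules = [{"call", "papers"}, {"workshop"}]
is_talk = classify_with_rules(talk_rules, "call for papers".split())
```

A rule set of this shape is easy for a user to read and edit by adding or removing words, which is the comprehensibility advantage the abstract points to; a TF-IDF weight vector offers no comparably direct handle.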

