Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification

(01 July 2005)

shashikant's tags for this article

classify filter spam statistic usenet

Abstract

Join author John Zdziarski for a look inside the brilliant minds that have conceived clever new ways to fight spam in all its nefarious forms. This landmark title describes in depth how next-generation spam filters use statistical filtering to identify and filter unwanted messages, how spam filtering works, and how language classification and machine learning combine to produce remarkably accurate spam filters.

After reading Ending Spam, you'll have a complete understanding of the mathematical approaches used by today's spam filters, as well as decoding, tokenization, various algorithms (including Bayesian analysis and Markovian discrimination), and the benefits of using open-source solutions to end spam. Zdziarski interviewed the creators of many of the best spam filters and has included their insights in this revealing examination of the anti-spam crusade.

If you're a programmer designing a new spam filter, a network admin implementing a spam-filtering solution, or just someone who's curious about how spam filters work and the tactics spammers use to evade them, Ending Spam will serve as an informative analysis of the war against spammers.

Table of Contents

Introduction

PART I: An Introduction to Spam Filtering
Chapter 1: The History of Spam
Chapter 2: Historical Approaches to Fighting Spam
Chapter 3: Language Classification Concepts
Chapter 4: Statistical Filtering Fundamentals

PART II: Fundamentals of Statistical Filtering
Chapter 5: Decoding: Uncombobulating Messages
Chapter 6: Tokenization: The Building Blocks of Spam
Chapter 7: The Low-Down Dirty Tricks of Spammers
Chapter 8: Data Storage for a Zillion Records
Chapter 9: Scaling in Large Environments

PART III: Advanced Concepts of Statistical Filtering
Chapter 10: Testing Theory
Chapter 11: Concept Identification: Advanced Tokenization
Chapter 12: Fifth-Order Markovian Discrimination
Chapter 13: Intelligent Feature Set Reduction
Chapter 14: Collaborative Algorithms

Appendix: Shining Examples of Filtering

Index

