Enhanced Word Classing for Model M
Model M is a superior class-based n-gram model that has shown improvements on a variety of tasks and domains. In previous work with Model M, bigram mutual information clustering has been used to derive word classes. In this paper, we introduce a new word classing method designed to closely match Model M. The proposed classing technique achieves gains in speech recognition word-error rate of up to 1.1% absolute over the baseline clustering, and a total gain of up to 3.0% absolute over a Katz-smoothed trigram model, the largest such gain ever reported for a class-based language model.