Context dependent class language model based on word co-occurrence matrix in LSA framework for speech recognition
In ACS'08: Proceedings of the 8th conference on Applied computer science (2008), pp. 275-280.
Abstract
We address the data sparseness problem in language models (LMs). A class LM is one way to mitigate this problem: infrequent words are supported by more frequent words in the same class. This paper investigates a class LM based on latent semantic analysis (LSA). In the LSA framework, a corpus is usually represented by a word-document matrix. However, this matrix ignores word order within a sentence. We propose several word co-occurrence matrices that preserve word order. Using these matrices, we define a context dependent class (CDC) LM, which distinguishes classes according to their context in the sentence. Experiments on the Wall Street Journal (WSJ) corpus show that the word co-occurrence matrix works better than the word-document matrix. Furthermore, the CDC LM achieves better perplexity than the traditional class LM based on LSA.
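
The point about word order can be illustrated with a minimal sketch. The abstract does not specify how the paper's co-occurrence matrices are built, so the adjacent-pair counting below is an assumption chosen for illustration, and the function names and toy sentences are hypothetical. It shows only that two word sequences with identical word-document counts can still be told apart by order-preserving co-occurrence counts.

```python
from collections import Counter

def word_document_counts(documents):
    """One Counter per document (word -> count); word order is discarded."""
    return [Counter(doc) for doc in documents]

def cooccurrence_counts(documents):
    """Counter over (previous_word, word) pairs; adjacent word order is kept.

    Assumption: only directly adjacent words are paired. The paper's
    matrices may be defined differently.
    """
    pairs = Counter()
    for doc in documents:
        pairs.update(zip(doc, doc[1:]))
    return pairs

# Toy data (hypothetical): same words, opposite order.
doc_a = ["dog", "bites", "man"]
doc_b = ["man", "bites", "dog"]

# The word-document representation cannot distinguish the two sequences.
same_columns = word_document_counts([doc_a])[0] == word_document_counts([doc_b])[0]

# The order-preserving co-occurrence counts can.
pairs_a = cooccurrence_counts([doc_a])
pairs_b = cooccurrence_counts([doc_b])
```

In an LSA setting, either kind of count matrix would then be factorized (typically by singular value decomposition) to obtain the low-dimensional word representations from which classes are derived; that step is omitted here.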