Collecting topic-related web data was found to be effective for language modeling of conversational speech. Topic-based keywords were first generated and submitted as queries to search engines; the retrieved text was then normalized and filtered by perplexity.
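The normalization and perplexity-filtering steps can be sketched roughly as follows. This is only an illustration under stated assumptions: the summary does not say which language model, normalization rules, or threshold were used, so the add-one smoothed unigram model, the function names (`normalize`, `train_unigram`, `perplexity`, `filter_by_perplexity`), and the threshold value here are all hypothetical stand-ins.

```python
import math
import re
from collections import Counter


def normalize(text):
    """Lowercase, replace everything except letters, apostrophes, and spaces, then tokenize."""
    text = text.lower()
    text = re.sub(r"[^a-z' ]+", " ", text)
    return text.split()


def train_unigram(sentences):
    """Count tokens in a small in-domain (conversational) seed set."""
    counts = Counter()
    for sentence in sentences:
        counts.update(normalize(sentence))
    return counts


def perplexity(tokens, counts):
    """Per-token perplexity under an add-one smoothed unigram model (an assumption)."""
    if not tokens:
        return float("inf")
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen words
    log_prob = 0.0
    for token in tokens:
        log_prob += math.log((counts[token] + 1) / (total + vocab))
    return math.exp(-log_prob / len(tokens))


def filter_by_perplexity(candidates, counts, threshold):
    """Keep web sentences whose perplexity is at or below the threshold."""
    return [c for c in candidates
            if perplexity(normalize(c), counts) <= threshold]


# Hypothetical seed data and web candidates, for illustration only.
seed = ["yeah i think so", "i do not know", "you know what i mean"]
web = ["I think you know", "Copyright 2008 terms privacy sitemap"]
model = train_unigram(seed)
kept = filter_by_perplexity(web, model, threshold=15.0)
```

In this toy run the conversational sentence scores a lower perplexity than the page-footer text, so only the former is kept. A real system would more plausibly use a higher-order n-gram model trained on in-domain transcripts, with the threshold tuned on held-out data.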
Reviewed by zzb3886 - 2008-09-06 01:43:50