Predicting Oral Reading Miscues
Notes for this article
System to generate an n-best list of likely child reading mispronunciations.
Compares two different methods ("rote" and "extrapolative").
Training data: Colorado DB (Olson et al.) of 112k transcribed child miscues (vocabulary of 881 distinct words).
The rote method selects miscues that have actually occurred in the past (avg. 34.2 per word, pared down to 7.4 by keeping only miscues that more than one student made).
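The rote filtering step can be sketched roughly as follows. This is only an illustration of "keep miscues that more than one student made", not the paper's implementation; the record layout `(student, target, miscue)`, the function name `rote_miscues`, and the `min_students` parameter are all assumptions made for the example.

```python
from collections import defaultdict


def rote_miscues(records, min_students=2):
    """Return, per target word, the miscues made by at least `min_students` students.

    `records` is an iterable of (student, target, miscue) tuples.
    This layout is hypothetical; the actual database format is not described in these notes.
    """
    # Collect the set of distinct students who produced each (target, miscue) pair,
    # so that one student repeating the same miscue is only counted once.
    students = defaultdict(set)
    for student, target, miscue in records:
        students[(target, miscue)].add(student)

    # Keep only the pairs observed from enough distinct students.
    kept = defaultdict(list)
    for (target, miscue), who in students.items():
        if len(who) >= min_students:
            kept[target].append(miscue)

    return {target: sorted(miscues) for target, miscues in kept.items()}


records = [
    ("s1", "house", "horse"),
    ("s2", "house", "horse"),
    ("s1", "house", "hose"),
    ("s1", "house", "hose"),  # same student twice, still one student
]
print(rote_miscues(records))
```

Counting distinct students rather than raw occurrences is what keeps a single student's repeated error from passing the filter.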
The extrapolative method uses machine learning to extract other miscue words that are similar in pronunciation to the target words. It uses a number of features, primarily a modified edit distance on pronunciation (more on that in another note).
Results: as would be expected, rote works well for very common words (lots of data) and extrapolative works well for uncommon words.
One big shortcoming was that miscues in the extrapolative method apparently had to be actual words in the dictionary. The miscues were also limited to words that start with the same phone as the target. That last restriction seems like a mistake: when they compare the utility of different features in training the extrapolative model, the "same-first-phoneme" feature, while useful, scored rather low.
They use a neat modified Levenshtein (edit) distance:
- 0-2 point penalty for substituting similar phones
- 5 point penalty for substituting non-similar phones
- unspecified penalty for insertion/deletion
All of this is normalized by the number of phones in the target word.
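A minimal sketch of such a weighted edit distance is given below, under several stated assumptions. The notes do not give the insertion/deletion penalty, so `indel_cost=3.0` is a placeholder chosen for illustration. The 0-2 range for similar phones is collapsed to a single value of 1.0, and the `similar` set of phone pairs is hypothetical, since the paper's similarity classes are not recorded here.

```python
def sub_cost(a, b, similar):
    """Substitution penalty: 0 for identical phones, 1 for 'similar' pairs, 5 otherwise.

    The single value 1.0 stands in for the 0-2 range mentioned in the notes (assumption).
    """
    if a == b:
        return 0.0
    if frozenset((a, b)) in similar:
        return 1.0
    return 5.0


def phone_distance(target, candidate, similar, indel_cost=3.0):
    """Weighted Levenshtein distance over phone sequences, normalized by target length.

    `indel_cost` is an assumed value; the source leaves this penalty unspecified.
    """
    n, m = len(target), len(candidate)
    if n == 0:
        raise ValueError("target must contain at least one phone")

    # prev[j] holds the cost of aligning the first i-1 target phones
    # with the first j candidate phones.
    prev = [j * indel_cost for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [i * indel_cost] + [0.0] * m
        for j in range(1, m + 1):
            cur[j] = min(
                prev[j] + indel_cost,       # delete a target phone
                cur[j - 1] + indel_cost,    # insert a candidate phone
                prev[j - 1] + sub_cost(target[i - 1], candidate[j - 1], similar),
            )
        prev = cur
    return prev[m] / n


# Hypothetical similarity set: treat /p/ and /b/ as similar phones.
similar = {frozenset(("p", "b"))}
print(phone_distance(["p", "ae", "t"], ["b", "ae", "t"], similar))
print(phone_distance(["p", "ae", "t"], ["k", "ae", "t"], similar))
```

Dividing by the target length keeps long words from being penalized just for having more phones to get wrong.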