Unsupervised learning of natural languages

edited by: James L. McClelland

Proceedings of the National Academy of Sciences, Vol. 102, No. 33. (16 August 2005), pp. 11629-11634.

nettraq's tags for this article

algorithms bioinformatics linguistics

Notes for this article

nettraq has 0 private notes and 1 public note for this article.

Full text of paper available at http://www.cs.tau.ac.il/~ruppin/pnas_adios.pdf

nettraq (public note) - 2005-09-01 19:02:30

Abstract

We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The ADIOS (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics.

