CiteULike is a free online bibliography manager. Register and you can start organising your references online.
Tags

A Trigram HMM-Based POS Tagger for Indian Languages

by: Kamal Sarkar, Vivekananda Gayen

edited by: Suresh C. Satapathy, Siba K. Udgata, Bhabendra N. Biswal

In Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA), Vol. 199 (2013), pp. 205-212, doi:10.1007/978-3-642-35314-7_24  Key: citeulike:11868839

Formatted Citation


Show HTML

Likes (beta)

This copy of the article hasn't been liked by anyone yet.

View FullText article


Abstract

We present in this paper a trigram HMM-based (Hidden Markov Model) part-of-speech (POS) tagger for Indian languages, which will accept a raw text in an Indian language (typed in corresponding language font) to produce a POS tagged output. We implement the trigram POS Tagger from the scratch based on the second order Hidden Markov Model (HMM). For handling unknown words, we introduce a prefix analysis method and a word-type analysis method which are combined with the well known suffix analysis method for predicting the probable tags. Though our developed systems have been tested on the data for four Indian languages namely Bengali, Hindi, Marathi and Telugu, the developed system can be easily ported to a new language just by replacing the training file with the POS tagged data for the new language. Our developed trigram POS tagger has been compared to the bigram POS tagger defined as a baseline.


jukamal2012's tags for this article

Citations (CiTO)

No CiTO relationships defined

X There are no reviews yet

X Posting History


X Export records

Privacy Statement | Terms & Conditions
CiteULike organises scholarly (or academic) papers or literature and provides bibliographic (which means it makes bibliographies) for universities and higher education establishments. It helps undergraduates and postgraduates. People studying for PhDs or in postdoctoral (postdoc) positions. The service is similar in scope to EndNote or RefWorks or any other reference manager like BibTeX, but it is a social bookmarking service for scientists and humanities researchers.