CiteULike is a free online bibliography manager. Register and you can start organising your references online.
Tags

One size does not fit all: On how Markov model order dictates performance of genomic sequence analyses

by: Leelavati Narlikar, Nidhi Mehta, Sanjeev Galande, Mihir Arjunwadkar
Nucleic Acids Research, Vol. 41, No. 3. (1 February 2013), pp. 1416-1424, doi:10.1093/nar/gks1285  Key: citeulike:12010932

Formatted Citation


Show HTML

Likes (beta)

This copy of the article hasn't been liked by anyone yet.

View FullText article


Abstract

The structural simplicity and ability to capture serial correlations make Markov models a popular modeling choice in several genomic analyses, such as identification of motifs, genes and regulatory elements. A critical, yet relatively unexplored, issue is the determination of the order of the Markov model. Most biological applications use a predetermined order for all data sets indiscriminately. Here, we show the vast variation in the performance of such applications with the order. To identify the ‘optimal’ order, we investigated two model selection criteria: Akaike information criterion and Bayesian information criterion (BIC). The BIC optimal order delivers the best performance for mammalian phylogeny reconstruction and motif discovery. Importantly, this order is different from orders typically used by many tools, suggesting that a simple additional step determining this order can significantly improve results. Further, we describe a novel classification approach based on BIC optimal Markov models to predict functionality of tissue-specific promoters. Our classifier discriminates between promoters active across 12 different tissues with remarkable accuracy, yielding 3 times the precision expected by chance. Application to the metagenomics problem of identifying the taxum from a short DNA fragment yields accuracies at least as high as the more complex mainstream methodologies, while retaining conceptual and computational simplicity.


TRHvidsten's tags for this article

Citations (CiTO)

No CiTO relationships defined

X There are no reviews yet

X Find related articles with these CiteULike tags

X Posting History


X Export records

Privacy Statement | Terms & Conditions
CiteULike organises scholarly (or academic) papers or literature and provides bibliographic (which means it makes bibliographies) for universities and higher education establishments. It helps undergraduates and postgraduates. People studying for PhDs or in postdoctoral (postdoc) positions. The service is similar in scope to EndNote or RefWorks or any other reference manager like BibTeX, but it is a social bookmarking service for scientists and humanities researchers.