CiteULike is a free online bibliography manager. Register and you can start organising your references online.

On finding minimal absent words Export

BMC Bioinformatics, Vol. 10 (08 May 2009), 137.

Citation Format

[Posts]

View FullText article


X Reviews [Write a review of this article]

X Find related articles from these CiteULike users

X Find related articles with these CiteULike tags

X Posting History

X Abstract

Background: The problem of finding the shortest absent words in DNA data has been recently addressed, and algorithms for its solution have been described. It has been noted that longer absent words might also be of interest, but the existing algorithms only provide generic absent words by trivially extending the shortest ones. Results: We show how absent words relate to the repetitions and structure of the data, and define a new and larger class of absent words, called minimal absent words, that still captures the essential properties of the shortest absent words introduced in recent works. The words of this new class are minimal in the sense that if their leftmost or rightmost character is removed, then the resulting word is no longer an absent word. We describe an algorithm for generating minimal absent words that, in practice, runs in approximately linear time. An implementation of this algorithm is publicly available at ftp://www.ieeta.pt/~ap/maws. Conclusions: Because the set of minimal absent words that we propose is much larger than the set of the shortest absent words, it is potentially more useful for applications that require a richer variety of absent words. Nevertheless, the number of minimal absent words is still manageable since it grows at most linearly with the string size, unlike generic absent words that grow exponentially. Both the algorithm and the concepts upon which it depends shed additional light on the structure of absent words and complement the existing studies on the topic.


X BibTeX record

X RIS record


Privacy Statement | Terms & Conditions
CiteULike organises scholarly (or academic) papers or literature and provides bibliographic (which means it makes bibliographies) for universities and higher education establishments. It helps undergraduates and postgraduates. People studying for PhDs or in postdoctoral (postdoc) positions. The service is similar in scope to EndNote or RefWorks or any other reference manager like BibTeX, but it is a social bookmarking service for scientists and humanities researchers.