CiteULike is a free online bibliography manager. Register and you can start organising your references online.
Tags

SPA: a short peptide assembler for metagenomic data

by: Youngik Yang, Shibu Yooseph
Nucleic Acids Research, Vol. 41, No. 8. (1 April 2013), pp. e91-e91, doi:10.1093/nar/gkt118  Key: citeulike:12073825

Formatted Citation


Show HTML

Likes (beta)

This copy of the article hasn't been liked by anyone yet.

View FullText article


Abstract

The metagenomic paradigm allows for an understanding of the metabolic and functional potential of microbes in a community via a study of their proteins. The substrate for protein identification is either the set of individual nucleotide reads generated from metagenomic samples or the set of contig sequences produced by assembling these reads. However, a read-based strategy using reads generated by next-generation sequencing (NGS) technologies, results in an overwhelming majority of partial-length protein predictions. A nucleotide assembly-based strategy does not fare much better, as metagenomic assemblies are typically fragmented and also leave a large fraction of reads unassembled. Here, we present a method for reconstructing complete protein sequences directly from NGS metagenomic data. Our framework is based on a novel short peptide assembler (SPA) that assembles protein sequences from their constituent peptide fragments identified on short reads. The SPA algorithm is based on informed traversals of a de Bruijn graph, defined on an amino acid alphabet, to identify probable paths that correspond to proteins. Using large simulated and real metagenomic data sets, we show that our method outperforms the alternate approach of identifying genes on nucleotide sequence assemblies and generates longer protein sequences that can be more effectively analysed.


accopeland's tags for this article

Citations (CiTO)

No CiTO relationships defined

X There are no reviews yet

X Find related articles from these CiteULike users

X Find related articles with these CiteULike tags

X Posting History


X Export records

Privacy Statement | Terms & Conditions
CiteULike organises scholarly (or academic) papers or literature and provides bibliographic (which means it makes bibliographies) for universities and higher education establishments. It helps undergraduates and postgraduates. People studying for PhDs or in postdoctoral (postdoc) positions. The service is similar in scope to EndNote or RefWorks or any other reference manager like BibTeX, but it is a social bookmarking service for scientists and humanities researchers.