MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads

by: Toshiaki Namiki, Tsuyoshi Hachiya, Hideaki Tanaka, Yasubumi Sakakibara
In Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine (2011), pp. 116-124, doi:10.1145/2147805.2147818

Abstract

Motivation: An important step of "metagenomics" analysis is the assembly of multiple genomes from mixed sequence reads of multiple species in a microbial community. Most conventional pipelines employ a single-genome assembler with carefully optimized parameters and post-process the resulting scaffolds to correct assembly errors. Using a single-genome assembler for de novo metagenome assembly has two limitations: highly conserved sequences shared between different species often cause chimeric contigs, and sequences of highly abundant species are likely to be misidentified as repeats in a single genome, resulting in a number of small fragmented scaffolds. The metagenome assembly problem becomes harder when assembling from very short sequence reads.
Method: We modified and extended a single-genome, de Bruijn graph-based assembler for short reads, known as "Velvet" [27], to perform metagenome assembly from mixed short reads of multiple species; the extension is called "MetaVelvet". Our fundamental ideas are, first, to decompose the de Bruijn graph constructed from mixed short reads into individual sub-graphs and, second, to build scaffolds from each decomposed de Bruijn sub-graph as if it were the genome of an isolated species. We use two features, graph connectivity and coverage (abundance) difference, to decompose the de Bruijn graph.
Results: On simulated datasets, MetaVelvet succeeded in generating higher N50 scores and smaller chimeric scaffolds than any of the compared single-genome assemblers, produced scaffolds of a quality comparable to those from separate Velvet assemblies of isolated species' sequence reads, and reconstructed even relatively low-coverage genome sequences as scaffolds. On a real dataset of human gut microbial read data, MetaVelvet produced longer scaffolds, increased the number of predicted genes, and improved phylum-level taxonomic assignment, in the sense that the proportion of predicted genes that cannot be assigned to any taxonomy was reduced.
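The abstract reports N50 scores without defining them. As a reminder, a minimal sketch of the usual definition follows: the N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembled length. The function name `n50` is illustrative and is not taken from MetaVelvet or Velvet.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths, for illustration only.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

A higher N50 means that more of the assembly sits in long scaffolds, which is why it serves as a contiguity measure when comparing assemblers.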
Availability: The source code of MetaVelvet is freely available at http://metavelvet.dna.bio.keio.ac.jp under the GNU General Public License.
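The two features named in the Method paragraph, graph connectivity and coverage difference, can be illustrated with a toy sketch. This is not MetaVelvet's algorithm: it ignores reverse complements and sequencing errors, and it takes a single hand-picked coverage threshold as given. All function names and the threshold parameter are assumptions made for illustration.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a toy k-mer graph: nodes are k-mers, and two k-mers are
    linked when they appear consecutively in a read (overlap of k-1).
    Also counts how many times each k-mer is seen (its coverage)."""
    coverage = defaultdict(int)
    neighbours = defaultdict(set)
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        for kmer in kmers:
            coverage[kmer] += 1
            neighbours[kmer]  # make sure isolated k-mers are nodes too
        for a, b in zip(kmers, kmers[1:]):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return dict(coverage), dict(neighbours)


def connected_components(neighbours):
    """Split an undirected graph (node -> set of nodes) into components."""
    seen = set()
    components = []
    for start in neighbours:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


def decompose(reads, k, threshold):
    """Toy decomposition: label each k-mer high or low coverage using an
    assumed threshold, drop links between the two classes, then take
    connected components of what remains."""
    coverage, neighbours = build_de_bruijn(reads, k)
    is_high = {kmer: count >= threshold for kmer, count in coverage.items()}
    filtered = {
        kmer: {n for n in adjacent if is_high[n] == is_high[kmer]}
        for kmer, adjacent in neighbours.items()
    }
    return connected_components(filtered)


# Made-up reads: an abundant "species" sampled three times and a rare one
# sampled once; the two share the k-mer CCC.
reads = ["AAACCC"] * 3 + ["CCCGGT"]
sub_graphs = decompose(reads, k=3, threshold=2)
```

In this made-up input the shared k-mer joins both read sets into one connected graph, so connectivity alone cannot separate them; removing links between high-coverage and low-coverage k-mers yields two sub-graphs. The shared k-mer ends up with the abundant side, which shows how crude a fixed threshold is compared with a real assembler.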

