From de Bruijn Graphs to Rectangle Graphs for Genome Assembly
Algorithms in Bioinformatics
edited by: Ben Raphael, Jijun Tang
Jigsaw puzzles were originally constructed by painting a picture on a rectangular piece of wood and then cutting it into smaller pieces with a jigsaw. The Jigsaw Puzzle Problem is to find an arrangement of these pieces that fills the rectangle in such a way that neighboring pieces have “matching” boundaries with respect to color and texture. While the general Jigsaw Puzzle Problem is NP-complete, we discuss a simpler version (called the Rectangle Puzzle Problem) and study rectangle graphs, recently introduced by Bankevich et al., 2012, for assembling such puzzles. We establish the connection between the Rectangle Puzzle Problem and the problem of assembling genomes from read-pairs, and further extend the earlier analysis to real challenges encountered in applications of rectangle graphs in genome assembly. We demonstrate that addressing these challenges results in an assembler, SPAdes+, that improves on existing assembly algorithms in the case of bacterial genomes (including the particularly difficult case of genome assemblies from single cells). SPAdes+ is freely available from http://bioinf.spbau.ru/spades.