Hieranoid: Hierarchical orthology inference
Accurate inference of orthologs is essential in many research fields such as comparative genomics, molecular evolution, and genome annotation. Existing methods for genome-scale orthology inference are mostly based on all-versus-all similarity searches, which scale quadratically with the number of species. This limits their application to the growing number of available large-scale datasets. Here, we present Hieranoid, a new orthology inference method that uses a hierarchical approach. Hieranoid performs pairwise orthology analysis using InParanoid at each node in a guide tree as it progresses from the leaves to the root. This concept reduces the total run-time complexity from a quadratic to a linear function of the number of species. The tree hierarchy provides a natural structure for multi-species ortholog groups, and the aggregation of multiple sequences allows for multiple alignment similarity searching techniques, which can yield more accurate ortholog groups. On the recently published orthobench benchmark, Hieranoid showed the overall best performance. Our progressive approach presents a new way to infer orthologs that combines efficient graph-based methodology with aspects of compute-intensive tree-based methods. The linear scaling with the number of species is a major advantage for large-scale applications and makes Hieranoid well suited to cope with the vast number of genomes that will be sequenced in the future. Hieranoid is open-source and can be downloaded from Hieranoid.sbc.su.se

► We applied the idea of the progressive alignment algorithm to orthology prediction.
► It is based on progressively applying pairwise InParanoid along a guide tree.
► Our method scales linearly, as opposed to the quadratic scaling of most other methods.
► Benchmark results indicate that our new method is more accurate.
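
The scaling argument can be illustrated with a minimal sketch. It does not call InParanoid or Hieranoid: the guide tree, the species names, and the `visit` callback are hypothetical stand-ins, and the callback only records where a pairwise orthology analysis would run. The sketch walks a binary guide tree from the leaves to the root and counts one pairwise analysis per internal node, then compares that count with the number of species pairs an all-versus-all method would have to process.

```python
from itertools import combinations


def count_all_vs_all(species):
    """Number of species pairs an all-versus-all method would compare."""
    return sum(1 for _ in combinations(species, 2))


def progressive_pairs(tree, visit):
    """Post-order walk of a binary guide tree.

    `tree` is either a leaf (a species name as a string) or a 2-tuple of
    subtrees. `visit(left, right)` is called once per internal node, after
    both children have been processed; it is a placeholder for one pairwise
    orthology analysis (an assumption, not the real tool's interface).
    """
    if isinstance(tree, str):
        return tree
    left, right = tree
    left_label = progressive_pairs(left, visit)
    right_label = progressive_pairs(right, visit)
    visit(left_label, right_label)
    # The merged label stands in for the aggregated ortholog groups
    # that would be passed up to the parent node.
    return f"({left_label},{right_label})"


# Hypothetical five-species guide tree, chosen only for illustration.
species = ["human", "mouse", "chicken", "fly", "worm"]
guide_tree = ((("human", "mouse"), "chicken"), ("fly", "worm"))

steps = []
progressive_pairs(guide_tree, lambda a, b: steps.append((a, b)))

print("progressive pairwise analyses:", len(steps))
print("all-versus-all species pairs:", count_all_vs_all(species))
```

For a binary guide tree with n leaves there are n - 1 internal nodes, so the number of pairwise analyses grows linearly with n, whereas the number of species pairs is n(n - 1)/2. The sketch counts analyses only; it says nothing about the cost of each individual analysis, which at inner nodes involves aggregated groups rather than single proteomes.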