A phylogenomic analysis of the Ascomycota
An automated procedure was developed to extract orthologous sequences from fungal genomes and incorporate them into phylogenomic analyses in a timely and efficient manner. This approach involves parsing an all-versus-all BLASTP search of 17 proteomes and creating a similarity matrix from e-values, which is then used to cluster proteins into related groups by means of a Markov Clustering algorithm. After performing this analysis at different stringency levels, 854 single-copy protein clusters that were ubiquitously distributed across all 17 proteomes were identified. These clusters were culled to retain only those in which all proteins had best hits to, and received hits from, a protein within the same cluster. The final data set included gapless alignments for 781 clusters of orthologous sequences that were concatenated into one super alignment containing 195,664 amino acid characters. Neighbor-joining distance and maximum likelihood analyses resulted in identical topologies, and all except one node received 100% bootstrap support. The node supporting Stagonospora nodorum’s position received 83% support or higher; it was also the only taxon differentially resolved in the maximum parsimony analyses. All analyses resolved the two derived subphyla Pezizomycotina and Saccharomycotina, and Schizosaccharomyces pombe as an early diverging lineage of the Ascomycota. Importantly, these analyses resolved the Leotiomycetes as the sister group to the Sordariomycetes, a region of the Ascomycota phylogeny that has remained problematic in molecular phylogenetic studies of more limited character sampling. Additional phylogenetic analyses that included orthologous sequences from an unannotated ascomycotan genome (e.g., Coccidioides immitis) and subsets of orthologs with different characteristics supported this topology. Phylogenetic analyses of the 595 orthologs that included C. immitis resulted in a topology identical to that of the previous 781-ortholog analysis and correctly placed C. immitis in the Eurotiomycetes. This demonstrated the correct identification of orthologs and the ability to incorporate unannotated genomic data into a common phylogenetic analysis.
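The cluster-filtering steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `genome|protein` identifier convention, and the `best_hit` dictionary (each protein mapped to its best non-self BLASTP hit) are assumptions made for illustration. The second filter is one possible reading of "best hits to and received hits from a protein within the same cluster", namely that every member's best hit lies inside the cluster and every member is the best hit of at least one other member.

```python
from collections import Counter


def genome_of(protein_id):
    # Assumed (hypothetical) ID convention: "genome|protein".
    return protein_id.split("|", 1)[0]


def single_copy_ubiquitous(clusters, genomes):
    """Keep clusters holding exactly one protein from every genome."""
    wanted = set(genomes)
    kept = []
    for cluster in clusters:
        counts = Counter(genome_of(p) for p in cluster)
        if set(counts) == wanted and all(n == 1 for n in counts.values()):
            kept.append(cluster)
    return kept


def hits_stay_in_cluster(cluster, best_hit):
    """True if each member's best hit is a cluster member and each
    member is, in turn, the best hit of some cluster member."""
    members = set(cluster)
    targets = set()
    for protein in cluster:
        hit = best_hit.get(protein)
        if hit not in members:
            return False
        targets.add(hit)
    return targets == members


# Toy data with three genomes (A, B, C); real input would come from
# parsed clustering output and BLASTP results.
genomes = ["A", "B", "C"]
clusters = [
    ["A|1", "B|1", "C|1"],          # single copy, present in all genomes
    ["A|2", "A|3", "B|2", "C|2"],   # two copies in genome A
    ["A|4", "B|4"],                 # missing genome C
]
best_hit = {"A|1": "B|1", "B|1": "C|1", "C|1": "A|1"}

candidates = single_copy_ubiquitous(clusters, genomes)
orthologs = [c for c in candidates if hits_stay_in_cluster(c, best_hit)]
```

In this toy input only the first cluster passes both filters. A less strict variant could accept any within-cluster hit, rather than only the best hit, as evidence that a protein "received hits" from its cluster.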