Phylogenetic footprinting of transcription factor binding sites in proteobacterial genomes
Toward the goal of identifying complete sets of transcription factor (TF)-binding sites in the genomes of several gamma proteobacteria, and hence describing their transcription regulatory networks, we present a phylogenetic footprinting method for identifying these sites. Probable transcription regulatory sites upstream of Escherichia coli genes were identified by cross-species comparison using an extended Gibbs sampling algorithm. Close examination of a study set of 184 genes with documented transcription regulatory sites revealed that when orthologous data were available from at least two other gamma proteobacterial species, 81% of our predictions corresponded with the documented sites, and 67% corresponded when data from only one other species were available. That the remaining predictions included bona fide TF-binding sites was proven by affinity purification of a putative transcription factor (YijC) bound to such a site upstream of the fabA gene. Predicted regulatory sites for 2097 E.coli genes are available at http://www.wadsworth.org/resnres/bioinfo/.