Comparative motif discovery combined with comparative transcriptomics yields accurate targetome and enhancer predictions
The identification of transcription factor binding sites, enhancers, and transcriptional target genes often relies on the integration of gene expression profiling and computational cis-regulatory sequence analysis. Methods for the prediction of cis-regulatory elements can take advantage of comparative genomics to increase the signal-to-noise ratio. However, gene expression data are usually derived from only one species. Here we investigate tissue-specific cross-species gene expression profiling by high-throughput sequencing, combined with cross-species motif discovery. First, we compared different methods for expression level quantification and cross-species integration using Tag-seq data. Using the optimal pipeline, we derived a set of genes with conserved expression during retinal determination across Drosophila melanogaster, Drosophila yakuba, and Drosophila virilis. These genes are enriched for binding sites of eye-related transcription factors, including the zinc-finger protein Glass, a master regulator of photoreceptor differentiation. Validation of predicted Glass targets using RNA-seq in homozygous glass mutants confirms that the majority of our predictions are expressed downstream of Glass. Finally, we tested nine candidate enhancers by in vivo reporter assays and found that eight of them drive GFP in the eye disc; seven of these colocalize with the Glass protein, namely scrt, chp, dpr10, CG6329, retn, Lim3, and dmrt99B. In conclusion, we show for the first time the combined use of cross-species expression profiling with cross-species motif discovery as a method to define a core developmental program, and we augment the candidate Glass targetome from a single known target gene, lozenge, to at least 62 conserved transcriptional targets.
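The idea of a gene set with conserved expression across species can be illustrated with a minimal sketch. It assumes that each species' upregulated genes have already been mapped to shared ortholog identifiers, and that conservation simply means membership in every species' set. Both are simplifying assumptions for illustration; the abstract does not state how conservation was scored. The function name `conserved_genes` and the gene labels are hypothetical placeholders, not data from the study.

```python
from typing import Dict, Set


def conserved_genes(upregulated_by_species: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes that are flagged as upregulated in every species.

    Assumes gene identifiers are already mapped to a common ortholog
    namespace (an assumption of this sketch, not a detail from the source).
    """
    if not upregulated_by_species:
        return set()
    gene_sets = list(upregulated_by_species.values())
    # Plain set intersection: a gene is kept only if all species share it.
    return set.intersection(*gene_sets)


# Toy, made-up input: placeholder gene labels, not real results.
toy = {
    "D. melanogaster": {"geneA", "geneB", "geneC"},
    "D. yakuba": {"geneA", "geneB", "geneD"},
    "D. virilis": {"geneA", "geneB", "geneC", "geneE"},
}

print(sorted(conserved_genes(toy)))  # ['geneA', 'geneB']
```

A strict intersection is the most conservative choice; a real pipeline might instead use a statistical threshold per species or tolerate missing orthologs, which this sketch does not attempt.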