Localizing Genes to Cerebellar Layers by Classifying ISH Images
Gene expression controls how the brain develops and functions. Understanding these control processes in the brain is particularly hard because they involve numerous types of neurons and glia, and very little is known about which genes are expressed in which cells and brain layers. Here we describe an approach to detect genes whose expression is primarily localized to a specific brain layer, and apply it to the mouse cerebellum. We learn typical spatial patterns of expression from a few markers that are known to be localized to specific layers, and use these patterns to predict localization for new genes. We analyze images of in-situ hybridization (ISH) experiments, which we represent using histograms of local binary patterns (LBP), and train image-level and gene-level classifiers for four layers of the cerebellum: the Purkinje, granular, molecular and white matter layers. On held-out data, the layer classifiers achieve an area under the ROC curve (AUC) above 94% by representing each image at multiple scales and by combining multiple image scores into a single gene-level decision. When applied to the full mouse genome, the classifiers predict specific layer localization for hundreds of new genes in the Purkinje and granular layers. Many genes localized to the Purkinje layer are likely to be expressed in astrocytes, and many others are involved in lipid metabolism, possibly due to the unusual size of Purkinje cells.
The way gene expression is spatially distributed across the brain reflects the function and micro-structure of neural tissues. Measuring these patterns is hard because brain tissues are composed of many types of neurons and glial cells, and the average gene expression across a region mixes transcripts from many different cells. We present here an approach to identify genes that are primarily expressed in specific brain layers or cell types, based on analyzing high-resolution in-situ hybridization images. By learning the spatial patterns of a few known cell markers, we annotate the expression patterns of hundreds of new genes, and predict the layers and cell types they are expressed in.
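The pipeline summarized above (LBP histograms computed at several scales, per-image scores pooled into one gene-level score, evaluation by AUC) can be illustrated with a minimal sketch. It is not the implementation used in this work: the basic 3x3 LBP variant, the downsampling factors, and mean pooling of image scores are all assumptions chosen for brevity, and every function name is hypothetical. The per-image scores are assumed to come from a classifier trained on the multi-scale features, which is left out here.

```python
from statistics import mean

# Offsets of the 8 neighbours of a pixel, clockwise from the top-left.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(img):
    """Normalized 256-bin histogram of basic 3x3 LBP codes.

    `img` is a 2D list of grayscale intensities. Border pixels are skipped.
    """
    height, width = len(img), len(img[0])
    hist = [0] * 256
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                # A neighbour at least as bright as the centre sets its bit.
                if img[r + dr][c + dc] >= img[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else hist


def downsample(img, factor):
    """Average-pool the image by an integer factor (assumed scale scheme)."""
    height, width = len(img) // factor, len(img[0]) // factor
    return [[mean(img[r * factor + i][c * factor + j]
                  for i in range(factor) for j in range(factor))
             for c in range(width)]
            for r in range(height)]


def multiscale_lbp(img, factors=(1, 2, 4)):
    """Concatenate LBP histograms computed at several resolutions."""
    features = []
    for factor in factors:
        features.extend(lbp_histogram(downsample(img, factor)))
    return features


def gene_score(image_scores):
    """Pool per-image classifier scores into one gene-level score.

    Mean pooling is an assumption; other pooling rules are possible.
    """
    return mean(image_scores)


def auc(scores, labels):
    """Area under the ROC curve: the fraction of (positive, negative)
    pairs ranked correctly, counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In this sketch, each ISH image of a gene would be converted with `multiscale_lbp`, scored by a layer classifier, and the resulting scores passed to `gene_score`; `auc` then compares gene-level scores against known marker labels on held-out genes.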