Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene-induced Signaling
Cellular signal transduction generally involves cascades of post-translational protein modifications that rapidly catalyze changes in protein-DNA interactions and gene expression. High-throughput measurements are improving our ability to study each of these stages individually, but do not capture the connections between them. Here we present an approach for building a network of physical links among these data that can be used to prioritize targets for pharmacological intervention. Our method recovers the critical missing links between proteomic and transcriptional data by relating changes in chromatin accessibility to changes in expression and then uses these links to connect proteomic and transcriptome data. We applied our approach to integrate epigenomic, phosphoproteomic and transcriptome changes induced by the variant III mutation of the epidermal growth factor receptor (EGFRvIII) in a cell line model of glioblastoma multiforme (GBM). To test the relevance of the network, we used small molecules to target highly connected nodes implicated by the network model that were not detected by the experimental data in isolation and we found that a large fraction of these agents alter cell viability. Among these are two compounds, ICG-001, targeting CREB binding protein (CREBBP), and PKF118–310, targeting β-catenin (CTNNB1), which have not been tested previously for effectiveness against GBM. At the level of transcriptional regulation, we used chromatin immunoprecipitation sequencing (ChIP-Seq) to experimentally determine the genome-wide binding locations of p300, a transcriptional co-regulator highly connected in the network. Analysis of p300 target genes suggested its role in tumorigenesis. We propose that this general method, in which experimental measurements are used as constraints for building regulatory networks from the interactome while taking into account noise and missing data, should be applicable to a wide range of high-throughput datasets. The ways in which cells respond to changes in their environment are controlled by networks of physical links among the proteins and genes. The initial signal of a change in conditions rapidly passes through these networks from the cytoplasm to the nucleus, where it can lead to long-term alterations in cellular behavior by controlling the expression of genes. These cascades of signaling events underlie many normal biological processes. As a result, being able to map out how these networks change in disease can provide critical insights for new approaches to treatment. We present a computational method for reconstructing these networks by finding links between the rapid short-term changes in proteins and the longer-term changes in gene regulation. This method brings together systematic measurements of protein signaling, genome organization and transcription in the context of protein-protein and protein-DNA interactions. When used to analyze datasets from an oncogene expressing cell line model of human glioblastoma, our approach identifies key nodes that affect cell survival and functional transcriptional regulators.