A Network-based Approach for Predicting Missing Pathway Interactions
Embedded within large-scale protein interaction networks are signaling pathways that encode response cascades in the cell. Unfortunately, even for well-studied species like S. cerevisiae, only a fraction of all true protein interactions are known, which makes it difficult to reason about the exact flow of signals and the corresponding causal relations in the network. To help address this problem, we introduce a framework for predicting new interactions that improve connectivity between the upstream proteins (sources) and downstream transcription factors (targets) of a particular pathway. Our algorithms attempt to globally minimize the distance between sources and targets by finding a small set of shortcut edges to add to the network. Unlike existing algorithms for predicting general protein interactions, our approach focuses on proteins involved in specific responses and thereby homes in on pathway-consistent interactions. We applied our method to extend pathways in osmotic stress response in yeast and identified several missing interactions, some of which are supported by published reports. We also performed experiments that support a novel interaction not previously reported. Our framework is general and may be applicable to edge prediction problems in other domains.

Networks of protein interactions encode a variety of molecular processes occurring in the cell. Embedded within these networks are important subnetworks called signaling pathways. Pathways are initiated by upstream proteins (called sources) that receive signals from the environment and trigger a cascade of information to downstream proteins (called targets). Modeling the interactions that occur within this cascade is important because pathway disruption has been linked to several diseases. Further, the interactions help us better understand how cells respond to various conditions and environments. Unfortunately, interaction networks today are largely incomplete, which makes this analysis difficult.
We provide a framework to model missing interactions in pathways by searching for interactions that putatively result in quicker and more efficient source-target cascades. We find that we can substantially shorten source-target distances with only a few additional edges and that many of our predicted edges have support in several knowledge databases and literature reports. We believe our approach will be useful for identifying interesting and important pathway-centric interactions that have been missed by previous experimental assays.
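
To make the idea of shortcut edges concrete, the following is a minimal sketch, not the algorithm used in this work. It assumes an undirected, unweighted network, a user-supplied list of candidate edges, and a simple greedy rule that repeatedly adds the candidate giving the largest drop in total source-target shortest-path distance. The function names, the toy network, and the unreachable-pair penalty are all illustrative assumptions.

```python
from collections import deque


def bfs_distances(adj, start):
    """Hop distances from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def total_distance(adj, sources, targets):
    """Sum of shortest-path lengths over all source-target pairs.

    Unreachable pairs are charged len(adj) hops (an assumed penalty).
    """
    penalty = len(adj)
    total = 0
    for s in sources:
        dist = bfs_distances(adj, s)
        total += sum(dist.get(t, penalty) for t in targets)
    return total


def greedy_shortcuts(adj, sources, targets, candidates, k):
    """Greedily pick up to k candidate edges that shrink total distance."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    remaining = [(u, v) for u, v in candidates if v not in adj[u]]
    added = []
    for _ in range(k):
        best_edge = None
        best_cost = total_distance(adj, sources, targets)
        for u, v in remaining:
            # Tentatively add the edge, score it, then undo.
            adj[u].add(v)
            adj[v].add(u)
            cost = total_distance(adj, sources, targets)
            adj[u].discard(v)
            adj[v].discard(u)
            if cost < best_cost:
                best_edge, best_cost = (u, v), cost
        if best_edge is None:  # no candidate improves the objective
            break
        u, v = best_edge
        adj[u].add(v)
        adj[v].add(u)
        remaining.remove(best_edge)
        added.append(best_edge)
    return added


# Hypothetical toy pathway: a single chain S - a - b - c - d - T.
toy_network = {
    "S": {"a"}, "a": {"S", "b"}, "b": {"a", "c"},
    "c": {"b", "d"}, "d": {"c", "T"}, "T": {"d"},
}
toy_candidates = [("a", "d"), ("S", "b")]
predicted = greedy_shortcuts(toy_network, ["S"], ["T"], toy_candidates, k=2)
print(predicted)  # [('a', 'd')]
```

In this toy chain the source-target distance is 5 hops. Adding the candidate a-d reduces it to 3, after which S-b offers no further improvement, so the search stops with one predicted edge. Restricting the search to a candidate list is a design choice in this sketch: without it, a direct source-target edge would always be the trivially best shortcut.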