Accurate Prediction of Peptide Binding Sites on Protein Surfaces
Many important protein–protein interactions are mediated by the binding of a short peptide stretch in one protein to a large globular segment in another. Recent efforts have provided hundreds of examples of new peptides binding to proteins for which a three-dimensional structure is available (either known experimentally or readily modeled) but where no structure of the protein–peptide complex is known. To address this gap, we present an approach that can accurately predict peptide binding sites on protein surfaces. For peptides known to bind a particular protein, the method predicts binding sites with great accuracy, and the specificity of the approach means that it can also be used to predict whether or not a putative or predicted peptide partner will bind. We used known protein–peptide complexes to derive preferences, in the form of spatial position specific scoring matrices, which describe the binding-site environment in globular proteins for each type of amino acid in bound peptides. We then scan the surface of a putative binding protein for sites for each of the amino acids present in a peptide partner and search for combinations of high-scoring amino acid sites that satisfy constraints deduced from the peptide sequence. The method performed well in a benchmark and largely agreed with experimental data mapping binding sites for several recently discovered interactions mediated by peptides, including RG-rich proteins with SMN domains, Epstein-Barr virus LMP1 with TRADD domains, DBC1 with Sir2, and the Ago hook with Argonaute PIWI domain. The method, and associated statistics, is an excellent tool for predicting and studying binding sites for newly discovered peptides mediating critical events in biology. An important class of protein interactions in critical cellular processes, such as signaling pathways, involves a domain from one protein binding to a linear peptide stretch of another. Many methods identify peptides mediating such interactions but without details of how the interactions occur, even when excellent structural information is available for the unbound protein. Experimental studies are currently time consuming, while existing computational methods to predict protein–peptide structures mostly focus on interactions involving specific protein families. Here, we present a general approach for predicting protein–peptide interaction sites. We show that spatial atomic position specific scoring matrices of binding sites for each peptide residue can capture the properties important for binding and when used to scan the surface of target proteins can accurately identify candidate binding sites for interacting peptides. The resulting predictions are highly illuminating for several recently described protein–peptide complexes, including RG-rich peptides with SMN domains, the Epstein-Barr virus LMP1 with TRADD domains, DBC1 with Sir2, and the Ago hook with the Argonaute PIWI domain. The accurate prediction of protein–peptide binding without prior structural knowledge will ultimately enable better functional characterization of many protein interactions involved in vital biological processes and provide a better picture of cellular mechanisms.