Prediction of Protein Binding Regions in Disordered Proteins
Many disordered proteins function by binding to a structured partner and undergo a disorder-to-order transition. This coupled folding and binding can confer several functional advantages, such as precise control of binding specificity without increased affinity. Additionally, the inherent flexibility allows the binding site to adopt various conformations and to bind to multiple partners. These features explain the prevalence of such binding elements in signaling and regulatory processes. In this work, we report ANCHOR, a method for the prediction of disordered binding regions. ANCHOR relies on the pairwise energy estimation approach that is the basis of IUPred, an earlier general disorder prediction method. In order to predict disordered binding regions, we seek to identify segments that are in disordered regions, cannot form enough favorable intrachain interactions to fold on their own, and are likely to gain stabilizing energy by interacting with a globular protein partner. The performance of ANCHOR was found to be largely independent of the amino acid composition and of the secondary structure adopted upon binding. Longer binding sites were generally predicted to be segmented, in agreement with available experimentally characterized examples. Scanning several hundred proteomes showed that the occurrence of disordered binding sites increased with the complexity of the organisms, even when compared to disordered regions in general. Furthermore, the length distribution of binding sites differed from that of disordered protein regions in general and was dominated by shorter segments. These results underline the importance of disordered proteins and protein segments in establishing new binding regions. Due to their specific biophysical properties, disordered binding sites generally carry a robust sequence signal, and this signal is efficiently captured by our method. Through its generality, ANCHOR opens new ways to study the essential functional sites of disordered proteins.
Intrinsically unstructured/disordered proteins (IUPs/IDPs) do not adopt a stable structure in isolation but exist as a highly flexible ensemble of conformations. Despite the lack of a well-defined structure, these proteins carry out important functions. Many IUPs/IDPs function by binding specifically to other macromolecules, a process that involves a disorder-to-order transition. The molecular recognition functions of IUPs/IDPs include regulatory and signaling interactions where binding to multiple partners and high-specificity/low-affinity interactions play a crucial role. Due to their specific functional and structural roles, these binding regions have distinct properties compared to both globular proteins and disordered regions in general. Here, we present a general method to identify disordered binding regions from the amino acid sequence. Our method targets the essential feature of these regions: they behave in a characteristically different manner in isolation than when bound to their partner protein. This prediction method allows us to compare the binding properties of short and long binding sites. We also analyzed the evolutionary relationship between the abundance of disordered binding regions and that of general disordered regions in various organisms. Our results suggest that disordered binding regions can be recognized even without taking into account their adopted secondary structure or their specific binding partner.
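The three conditions used to recognize a disordered binding region (the segment lies in a disordered region, it cannot form enough favorable intrachain interactions on its own, and it is likely to gain stabilizing energy from a globular partner) can be illustrated with a toy filter. The following is only a minimal sketch under stated assumptions, not the ANCHOR implementation: the function name, the per-residue input profiles, the hard thresholds, and the minimum segment length are all hypothetical, and the energy values are assumed to be precomputed estimates in which lower means more favorable.

```python
from typing import List, Sequence, Tuple


def candidate_binding_segments(
    disorder: Sequence[float],   # hypothetical per-residue disorder tendency (0..1)
    e_intra: Sequence[float],    # hypothetical intrachain energy estimate (lower = more favorable)
    e_partner: Sequence[float],  # hypothetical energy estimate with a globular partner
    disorder_cut: float = 0.5,   # illustrative threshold, not taken from the paper
    intra_cut: float = -0.2,     # illustrative threshold, not taken from the paper
    gain_cut: float = 0.5,       # illustrative threshold, not taken from the paper
    min_len: int = 3,            # illustrative minimum segment length
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges where all three toy criteria hold."""
    if not (len(disorder) == len(e_intra) == len(e_partner)):
        raise ValueError("all per-residue profiles must have the same length")

    flags = [
        d > disorder_cut             # 1. residue lies in a disordered region
        and ei > intra_cut           # 2. intrachain interactions are not favorable enough
        and (ei - ep) > gain_cut     # 3. binding to the partner would lower the energy
        for d, ei, ep in zip(disorder, e_intra, e_partner)
    ]

    segments: List[Tuple[int, int]] = []
    start = None
    # A trailing False sentinel closes a run that reaches the end of the sequence.
    for i, flag in enumerate(flags + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    return segments


# Toy profiles for an eight-residue sequence (made-up numbers for illustration only).
disorder = [0.2, 0.3, 0.8, 0.9, 0.9, 0.8, 0.2, 0.9]
e_intra = [-1.0, -1.0, 0.2, 0.3, 0.2, 0.1, -1.0, 0.2]
e_partner = [-0.5, -0.5, -0.6, -0.7, -0.6, -0.5, -0.5, -0.4]
print(candidate_binding_segments(disorder, e_intra, e_partner))
```

In this toy input, residues 2 to 5 satisfy all three conditions and form one candidate segment, while the isolated residue at index 7 is discarded by the minimum-length filter. Hard thresholds are used here only to keep the logic readable; how the three signals are actually combined and smoothed is a matter for the method description itself.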