Prediction of ligand-induced structural polymorphism of receptor interaction sites using machine learning.
The functions of proteins are closely related to their three-dimensional structures. When a protein binds another molecule, such as a small ligand or another protein, its main chain and side chains undergo conformational changes of varying degree. This ligand-induced structural polymorphism is also referred to as "induced fit", and it plays an important role in the recognition of particular classes of ligands as well as in signal transduction. We have developed new prediction models that identify the residues whose conformation fluctuates upon ligand binding. The training and test datasets were obtained from the Protein Data Bank. Induced-fit residues were assigned based on the Z values of the Cα atom distances within each protein cluster. We also introduced various descriptors, such as the number of residues, the accessible surface area (ASA), the depth of the residue, and the Position-Specific Scoring Matrix (PSSM), which were derived from the 2D or 3D structural information of the protein. After optimizing the parameters by 5-fold cross-validation, we applied the best prediction model to several well-known induced-fit target proteins to verify its effectiveness. In particular, in the validation on the DFG motif of the protein kinase family, the model succeeded in predicting the possibility of the DFG-out conformation from only the DFG-in conformation of each kinase structure.
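The Z-value criterion for induced-fit residues can be illustrated with a minimal sketch. The abstract does not give the exact definition, so this shows only one plausible reading: the Cα displacement of each residue between two already superposed structures from the same cluster is standardized against all residues of that pair, and residues above a cutoff are flagged. The function names, the pairwise comparison, and the cutoff of 2.0 are assumptions made for illustration, not the procedure used in this work.

```python
from math import dist
from statistics import mean, pstdev


def ca_displacements(coords_a, coords_b):
    """Per-residue Calpha distance between two pre-superposed structures.

    Each argument is a list of (x, y, z) tuples, one per aligned residue.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of aligned residues")
    return [dist(a, b) for a, b in zip(coords_a, coords_b)]


def label_induced_fit(displacements, z_cutoff=2.0):
    """Flag residues whose displacement Z value exceeds z_cutoff.

    z_cutoff is a hypothetical threshold; the abstract does not state one.
    """
    mu = mean(displacements)
    sigma = pstdev(displacements)
    if sigma == 0:
        # No variation at all: nothing stands out as induced fit.
        return [False] * len(displacements)
    return [(d - mu) / sigma > z_cutoff for d in displacements]


# Toy example: ten residues on a line, with only the last one displaced by 5 units.
apo = [(float(i), 0.0, 0.0) for i in range(10)]
holo = apo[:-1] + [(9.0, 5.0, 0.0)]
labels = label_induced_fit(ca_displacements(apo, holo))
```

In this toy case only the last residue is flagged. With real data, the coordinates would come from clustered Protein Data Bank entries after structural superposition, and the resulting labels would serve as the training targets for the descriptors listed above.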