Non-parametric Bayesian approach to post-translational modification refinement of predictions from tandem mass spectrometry
Motivation: Tandem mass spectrometry (MS/MS) is a dominant approach for large-scale high-throughput post-translational modification (PTM) profiling. Although current state-of-the-art blind PTM spectral analysis algorithms can predict thousands of modified peptides (PTM predictions) in an MS/MS experiment, a significant percentage of these predictions have inaccurate modification mass estimates and false modification site assignments. This problem can be addressed by post-processing the PTM predictions with a PTM refinement algorithm. We developed a novel PTM refinement algorithm, iPTMClust, which extends a recently introduced PTM refinement algorithm PTMClust and uses a non-parametric Bayesian model to better account for uncertainties in the quantity and identity of PTMs in the input data. The use of this new modeling approach enables iPTMClust to provide a confidence score per modification site that allows fine-tuning and interpreting resulting PTM predictions.