quantms-rescoring enables deep proteome coverage across protein quantification, immunopeptidomics, and post-translational modifications experiments.

Created on 13 Jan 2026

Authors

Dai, C., Gabriels, R., Bouwmeester, R., Larrea, A., Scheid, J., Webel, H., He, F., Martens, L., Kohlbacher, O., Bai, M., Xie, L., Sachsenberg, T., Perez-Riverol, Y.

Abstract

The growing volume of public proteomics datasets and the advent of novel machine learning (ML)-based methods create unprecedented opportunities for discovery through large-scale reanalysis. However, traditional desktop tools are increasingly insufficient for processing and integrating data at this scale. To address this challenge, we present a novel package, quantms-rescoring, that extends the cloud-native quantms workflow with a machine learning-based rescoring module. Unlike prior tools that rescore single-engine outputs, quantms-rescoring integrates multiple search engines (SAGE, COMET, and MSGF+), performs automatic model selection and model fine-tuning, and scales reproducibly on cloud infrastructures. quantms-rescoring relies on multiple fragment-ion intensity prediction methods (AlphaPeptDeep and MS2PIP) and a retention-time prediction method (DeepLC) to improve results from multiple peptide database search engines. It features automatic model selection, fine-tuning, and retraining for MS/MS intensity and retention time prediction to select the best model for a given dataset. We applied the novel workflow to five representative datasets spanning DDA label-free quantification, TMT 10-plex isobaric labelling of tumor proteomics data, immunopeptidomics, phosphoproteomics, and unseen lysine malonylation experiments. We achieved a 16-22.8% increase in identified spectra, along with the quantification of 2191 additional phosphorylated peptides and 1337 additional phosphosites. In the tandem mass tag (TMT)-labeled clear cell renal cell carcinoma dataset, 76 novel differentially expressed proteins were identified by multiple search engines with quantms-rescoring. Additionally, 11,688 novel potential HLA-II binders were detected in the immunopeptidomics dataset by multiple search engines with quantms-rescoring. For unseen malonylation data, we reported 58.8% more malonylation PSMs and 30.5% more modification sites than COMET alone.
Together, these results show that multi-engine searches and machine learning-derived features can be combined in a scalable workflow that enhances identification, PTM localization, and quantification performance.

Preprint server: bioRxiv
The authors list and abstract were imported from bioRxiv on 13 Jan 2026.
