CDR-aware masked language models for paired antibodies enable state-of-the-art binding prediction

Created on 02 Nov 2025

Authors

Talaei, M., Walker, K. C., Hao, B., Jolley, E., Jin, Y., Kozakov, D., Misasi, J., Vajda, S., Paschalidis, I. C., Joseph-McCarthy, D.

Abstract

Antibodies are a leading class of biologics, yet their architecture, with conserved framework regions and hypervariable complementarity-determining regions (CDRs), poses unique challenges for computational modeling. We present a region-aware pretraining strategy for paired heavy (VH) and light (VL) sequences in variable domains using ESM2-3B and ESM C (600M) protein language models. We compare three masking strategies: whole-chain, CDR-focused, and a hybrid approach. Through evaluation on binding affinity datasets spanning single-mutant panels and combinatorial mutants, we demonstrate that CDR-focused training produces superior embeddings for functional prediction. Notably, training only on VH-VL pairs proves sufficient, eliminating the need for massive unpaired pretraining that provides no measurable downstream benefit. Our compact 600M ESM C model achieves state-of-the-art performance, matching or exceeding larger antibody-specific baselines. These findings establish a principled framework for antibody language models: prioritize paired sequences with CDR-aware supervision over scale and complex training curricula to achieve both computational efficiency and predictive accuracy.
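
The abstract does not spell out how the three masking strategies are implemented, so the following is only a minimal illustrative sketch, not the authors' method. It assumes CDR locations are supplied as half-open index spans over a single sequence, a 15% masking rate, and a hybrid scheme that splits the masking budget evenly between CDR and framework positions. The names `choose_mask_positions`, `apply_mask`, `mask_rate`, and `cdr_share` are hypothetical, as is the placeholder mask token.

```python
import random

MASK = "<mask>"  # placeholder mask token (assumption, not taken from the preprint)


def choose_mask_positions(seq_len, cdr_spans, strategy,
                          mask_rate=0.15, cdr_share=0.5, rng=None):
    """Pick residue indices to mask under one of three illustrative strategies.

    cdr_spans: list of half-open (start, end) index pairs marking CDRs.
    mask_rate and cdr_share are assumed values chosen for illustration.
    """
    rng = rng or random.Random()
    cdr = sorted({i for start, end in cdr_spans for i in range(start, end)})
    cdr_set = set(cdr)
    framework = [i for i in range(seq_len) if i not in cdr_set]
    n_mask = max(1, round(mask_rate * seq_len))

    if strategy == "whole_chain":
        # Sample uniformly over every position in the chain.
        picked = rng.sample(range(seq_len), min(n_mask, seq_len))
    elif strategy == "cdr_focused":
        # Sample only from CDR positions.
        picked = rng.sample(cdr, min(n_mask, len(cdr)))
    elif strategy == "hybrid":
        # Split the masking budget between CDR and framework positions.
        n_cdr = min(len(cdr), round(cdr_share * n_mask))
        n_fw = min(len(framework), n_mask - n_cdr)
        picked = rng.sample(cdr, n_cdr) + rng.sample(framework, n_fw)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return sorted(picked)


def apply_mask(sequence, positions):
    """Return the sequence as a token list with the chosen positions masked."""
    tokens = list(sequence)
    for i in positions:
        tokens[i] = MASK
    return tokens


if __name__ == "__main__":
    toy_sequence = "A" * 40            # stand-in for a variable-domain sequence
    toy_cdrs = [(5, 10), (20, 25)]     # made-up CDR spans for illustration
    for name in ("whole_chain", "cdr_focused", "hybrid"):
        pos = choose_mask_positions(len(toy_sequence), toy_cdrs, name,
                                    rng=random.Random(0))
        print(name, pos)
```

For a paired VH-VL input, the same idea would apply to the concatenated chains, with the CDR spans of the light chain offset by the length of the heavy chain and any separator tokens.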

Preprint server: bioRxiv
The authors list and abstract were imported from bioRxiv on 02 Nov 2025.
