Single-Pass Discrete Diffusion Predicts High-Affinity Peptide Binders at >1,000 Sequences per Second across 150 Receptor Targets

Created on 18 Mar 2026

Authors

Watson, A.

Abstract

De novo peptide design methods traditionally couple generation to 3D structure prediction, limiting throughput to seconds or hours per candidate. Here we present LigandForge, a discrete diffusion model that generates binding peptide sequences in a single forward pass from receptor pocket geometry alone -- no structure prediction, inverse folding, or iterative refinement at inference. LigandForge produces over 700 sequences per second on a single GPU (peak >1,000), a throughput advantage exceeding 10,000-fold over BoltzGen and 1,000,000-fold over BindCraft. We generated 490,691 peptides across 150 receptor targets and validated 16,475 by Boltz-2 structure prediction. DeltaForge, a Rust-based thermodynamic scoring engine calibrated against experimental binding data (Pearson r = 0.83 on the PPB-Affinity peptide benchmark), identified predicted sub-100 nM binders across 85 of 116 scored targets (73%), sub-10 nM across 62 (53%), and sub-1 nM across 35 (30%). In a five-target head-to-head on historically difficult targets (TNF-α, PD-L1, VEGF-A, IL-7R, HER2), LigandForge generated 150,000 candidates in 3.4 minutes and produced predicted sub-100 nM binders against all five targets (23 total from 576 folded structures), compared to 1 of 5 targets for BoltzGen (2 hits from 100 designs) and 0 for BindCraft (0 pipeline-accepted designs). DSSP analysis of 7,585 designed peptides revealed that LigandForge produces structurally diverse folds (45% helical, 28% β-sheet) compared to the helix-dominated outputs of backbone-sampling methods (BoltzGen 73%, BindCraft 90% helical).
LigandForge also generated peptides embedding within orthosteric pockets of aminergic GPCRs with no evolutionary precedent for peptide ligands, and natively targets heterodimeric and homomultimeric receptors including the CD8A-CD8B heterodimer (60.5% elite structural confidence, 19.5% simultaneous dual-chain engagement), the CD3D-CD3E signaling complex, and the KIT receptor tyrosine kinase homodimer in vacancy pairing mode (59% bivalent engagement, ΔG < -26 kcal/mol). These results demonstrate that thermodynamic knowledge compiled into model weights during training can replace iterative structure prediction at inference, enabling a paradigm shift from structure-dependent optimization of individual candidates to structure-free exploration of sequence space at scale -- with comparable or superior predicted binding quality, broader structural diversity, and access to target classes beyond the reach of backbone-sampling methods.
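
The headline figures in the abstract can be cross-checked with simple arithmetic. The Python sketch below is not taken from the preprint: the function names are hypothetical, and the Kd-to-ΔG conversion uses the textbook relation ΔG = RT ln(Kd) with a 1 M standard state at an assumed temperature of 298.15 K, which may differ from how DeltaForge actually scores binders.

```python
import math

# Figures quoted in the abstract.
CANDIDATES = 150_000          # sequences generated in the five-target comparison
MINUTES = 3.4                 # reported wall-clock time for that run
SCORED_TARGETS = 116          # targets scored by DeltaForge
TARGETS_WITH_HITS = {"sub-100 nM": 85, "sub-10 nM": 62, "sub-1 nM": 35}

# Gas constant in kcal/(mol*K); standard physical constant.
R_KCAL = 1.987204e-3


def throughput_per_second(n_sequences: int, minutes: float) -> float:
    """Average sequences generated per second over a run."""
    return n_sequences / (minutes * 60.0)


def percent(part: int, whole: int) -> int:
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100.0 * part / whole)


def kd_to_delta_g(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Textbook conversion from a dissociation constant (in mol/L) to a
    binding free energy in kcal/mol. Assumption: this is generic
    thermodynamics, not the DeltaForge scoring function."""
    return R_KCAL * temperature_k * math.log(kd_molar)


if __name__ == "__main__":
    rate = throughput_per_second(CANDIDATES, MINUTES)
    print(f"Average throughput: {rate:.0f} sequences/s")
    for label, count in TARGETS_WITH_HITS.items():
        print(f"{label}: {count}/{SCORED_TARGETS} = {percent(count, SCORED_TARGETS)}%")
    for kd in (1e-7, 1e-8, 1e-9):
        print(f"Kd = {kd:.0e} M -> dG = {kd_to_delta_g(kd):.2f} kcal/mol")
```

Under these assumptions, 150,000 candidates in 3.4 minutes averages roughly 735 sequences per second, consistent with the stated "over 700 sequences per second", and the three hit fractions round to the reported 73%, 53%, and 30%. For scale, a 100 nM dissociation constant corresponds to about -9.5 kcal/mol at 298.15 K under the textbook relation.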

Preprint server: bioRxiv
The authors list and abstract were imported from bioRxiv on 18 Mar 2026.
