Authors
Xingyu Liao, Yanyan Li, Yingfu Wu, Long Wen, Minghui Jing, Bolin Chen, Xingyi Li, Xuequn Shang
Published in
ACS Synthetic Biology. Oct 16, 2025. Epub Oct 16, 2025.
Abstract
The accurate classification of Cas proteins is crucial for understanding CRISPR-Cas systems and developing genome-editing tools. Here, we present TEMC-Cas, a deep learning framework for accurate classification of Cas proteins that combines a finely tuned ESM protein language model with contrastive learning. Unlike traditional methods that rely on sequence similarity (e.g., BLAST, HMMs) or structural prediction, TEMC-Cas leverages evolutionary-scale modeling to capture distant homology while employing contrastive learning to distinguish closely related subtypes. The framework incorporates LoRA for efficient parameter adaptation and addresses class imbalance through weighted loss functions. TEMC-Cas achieves superior performance in classifying the Cas1-Cas13 families and 17 Cas12 subtypes, demonstrating particular strength in identifying remote homology. This approach provides a robust tool for the discovery of the CRISPR system and expands the toolbox for genome engineering applications. TEMC-Cas is now freely accessible at https://github.com/Xingyu-Liao/TEMC-Cas.
PMID:
41100703
Bibliographic data and abstract were imported from PubMed on 17 Oct 2025.