Decoding enzymatic landscapes: a knowledge graph-enhanced large language model framework for microbial enzyme production and catalysis systems.

Created on 29 Jun 2026

Authors

Qichang Tong, Lincong Zhou, Xu Liu, Xiaoqing Liu, Ningfeng Wu, Yuan Wang, Huiying Luo, Bin Yao, Jian Tian, Dongfei Han, Xianghua Yan, Feifei Guan

Published in

aBIOTECH. Volume 7. Issue 3. Pages 100059. Epub May 30, 2026.

Abstract

Microbial enzyme production and catalysis systems are a crucial aspect of biotechnological research. However, building them from trustworthy published experimental data presents a major obstacle for both manual and automated techniques. Here, we introduce MEPAM (Microbial Enzyme Production and Catalytic Activity based on LLM), a question-answering system designed to accurately address inquiries related to enzyme production and catalytic reactions. Specifically, by training three machine learning models with >0.98 accuracy, we identified 11,068 high-quality, relevant articles from the Web of Science. Leveraging DeepSeek-V3 with zero-shot learning, we developed an ontology-driven knowledge representation that extracted 12,434 entities and 35,918 relations with 0.78 extraction accuracy and constructed a structured knowledge graph. Compared to few-shot learning and other machine learning methods, our framework achieved significantly higher extraction accuracy. Using this framework, we developed MEPAM based on retrieval-augmented generation and prompt engineering. Finally, using MEPAM, we extracted a comprehensive network involving the expression profiles, precise culture conditions, and substrate preferences for cellulase, demonstrating the strong utility of this tool. Compared with traditional LLMs, particularly GPT-4o, MEPAM exhibited superior performance, achieving significantly higher answer accuracy (0.86 vs. 0.52) and nearly eliminating hallucinations. MEPAM is available at http://180.76.108.212. This framework provides context-rich, verifiable insights, thus bridging predictive modeling with experimental validation to facilitate the exploration of microbial enzymatic systems.
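
The abstract describes answering questions by retrieving facts from a knowledge graph and passing them to a language model. The following is a minimal sketch of that general retrieval-augmented pattern, not the authors' implementation. The `Triple` type, the `retrieve` and `build_prompt` helpers, the relation names, and the toy triples are all hypothetical and are not taken from the MEPAM knowledge graph; the language model call itself is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One (head, relation, tail) edge of a knowledge graph."""
    head: str
    relation: str
    tail: str


# Hypothetical toy triples for illustration only; not MEPAM data.
TRIPLES = [
    Triple("cellulase", "expressed_in", "Trichoderma reesei"),
    Triple("cellulase", "acts_on", "cellulose"),
    Triple("xylanase", "acts_on", "xylan"),
]


def retrieve(question, triples):
    """Return triples whose head or tail entity is mentioned in the question.

    Plain substring matching stands in for whatever entity linking
    a real system would use (an assumption of this sketch).
    """
    q = question.lower()
    return [t for t in triples if t.head.lower() in q or t.tail.lower() in q]


def build_prompt(question, triples):
    """Assemble a prompt that restricts the model to the retrieved facts."""
    facts = retrieve(question, triples)
    lines = [f"- {t.head} {t.relation} {t.tail}" for t in facts]
    context = "\n".join(lines) if lines else "- (no matching facts)"
    return (
        "Answer using only the facts below; say so if they are insufficient.\n"
        f"Facts:\n{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("Which substrate does cellulase act on?", TRIPLES))
```

Restricting the prompt to retrieved facts is one common way such systems try to make answers traceable to source statements; how MEPAM does this in detail is not stated in the abstract.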

PMID: 42371480
Bibliographic data and abstract were imported from PubMed on 29 Jun 2026.
