Authors
Dong-Xu Cui, Shi-Yu Long, Yi-Xuan Tang, Yue Zhao, Qiao Li
Published in
Journal of Chemical Information and Modeling. Aug 25, 2025. Epub Aug 25, 2025.
Abstract
This study presents a systematic evaluation of five reasoning-enhanced Large Language Models (LLMs), namely Deepseek-R1-0528, OpenAI-o4 mini, Gemini-2.5-pro, doubao-seed-1.6-thinking, and qwen-max-latest, across nine key chemistry tasks. By comparing these models with traditional LLMs and established computational tools, we systematically investigate the influence of reasoning capabilities and prompt engineering on chemical cognition. The results demonstrate that reasoning-enabled LLMs achieve significant performance improvements in fundamental tasks and that, in most cases, overly complex prompts are not beneficial for these models. However, domain-specific limitations persist; for instance, all five models exhibited structural inaccuracies in CIF file generation (such as incorrect bond topologies). Notably, while reasoning frameworks enhance logical coherence, they do not fundamentally resolve challenges in stereochemical identification or the recognition of rare symmetry groups. In essence, the spatial recognition capabilities of current Large Language Models remain insufficient. These findings underscore the necessity of developing domain-optimized training paradigms to bridge the gap between general reasoning capabilities and specialized chemical applications.
PMID:
40854079
Bibliographic data and abstract were imported from PubMed on 26 Aug 2025.