Assessing the Utility of Large Language Models in Guiding Dental Practitioners on Pediatric Patient Care: A Comparative AI Study.

Created on 09 Oct 2025

Authors

Mithula Raj, Vignesh Ravindran, Abirami Arthanari

Published in

Journal of clinical and experimental dentistry. Volume 17. Issue 9. Pages e1099-e1107. Epub Sep 01, 2025.

Abstract

Background: Large Language Models (LLMs) are transforming clinical decision-making by offering rapid, context-aware access to evidence-based knowledge. However, their efficacy in pediatric dentistry remains underexplored, especially across multiple LLM platforms.
Objective: To comparatively evaluate the clinical quality, readability, and originality of responses generated by nine contemporary LLMs for pediatric dental queries.
Material and Methods: A cross-sectional study assessed the performance of ChatGPT-3.5, ChatGPT-4o, Gemini 2.0, Gemini 2.5, Claude 3.5 Haiku, Claude 3.7 Sonnet, Grok-3, Grok-3 Mini, and DeepSeek-V3. Twenty pediatric dental questions were posed in one-shot queries to each LLM. Responses were evaluated by ten pediatric dental experts using the Modified Global Quality Scale (MGQS), Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), and Turnitin Similarity Index. ANOVA and Cohen's Kappa were used for statistical analysis.
Results: ChatGPT-4o demonstrated the highest overall MGQS (4.28 ± 0.24), followed by ChatGPT-3.5 (3.45 ± 0.27). DeepSeek-V3 scored lowest (2.18 ± 0.19). Topic-wise, ChatGPT-4o consistently outperformed the other models across all subdomains. FRES and FKGL scores indicated moderate readability, with the Claude models exhibiting the highest linguistic complexity. Turnitin analysis revealed low-to-moderate similarity across models. Inter-rater agreement was substantial (κ = 0.78).
Conclusions: Among the evaluated LLMs, ChatGPT-4o exhibited superior performance in clinical relevance, coherence, and originality, suggesting its potential utility as an adjunct in pediatric dental decision-making. Nonetheless, variability across models underscores the need for critical appraisal and cautious integration into clinical workflows.
Key words: Artificial Intelligence, Clinical decision support, Health Communication, Large language models, Natural Language Processing.
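For readers unfamiliar with the readability and agreement metrics named in the abstract, the following is a minimal Python sketch of the standard Flesch Reading Ease, Flesch-Kincaid Grade Level, and two-rater Cohen's kappa formulas. It is an illustration only and not the authors' analysis pipeline: the function names and the sample counts are hypothetical, word, sentence, and syllable counts are assumed to be supplied by the caller, and the abstract does not state how kappa was aggregated across the ten raters.

```python
from collections import Counter


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard FRES formula; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard FKGL formula; approximates a US school grade level."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def cohen_kappa(rater_a: list, rater_b: list) -> float:
    """Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items with identical ratings.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(
        (freq_a[c] / n) * (freq_b[c] / n) for c in set(freq_a) | set(freq_b)
    )
    if p_expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical counts for a single response (not data from the study).
    print(round(flesch_reading_ease(100, 5, 150), 3))
    print(round(flesch_kincaid_grade(100, 5, 150), 2))
    # Hypothetical binary ratings from two raters on four items.
    print(cohen_kappa([1, 1, 0, 0], [1, 1, 0, 1]))
```

By the usual conventions, FRES values of roughly 60 to 70 correspond to plain English, and kappa values between 0.61 and 0.80 are commonly described as substantial agreement, which matches the abstract's reading of κ = 0.78.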

PMID: 41064784
Bibliographic data and abstract were imported from PubMed on 09 Oct 2025.
