Authors
Ekim Sağlam Gürmen, Mustafa Yorgancıoğlu, Abdurrahman Oral
Published in
Injury. Pages 113478. Jun 29, 2026. Epub Jun 29, 2026.
Abstract
Although large language models (LLMs) like ChatGPT are increasingly used in clinical reasoning, their reliability in procedural decision-making remains questionable. Wound management provides a standardized framework for evaluating their performance in practical scenarios. The primary goal of this study was to assess the level of agreement between ChatGPT's recommendations and emergency physicians' decisions in the management of traumatic lacerations.
This prospective, comparative study was conducted in the Emergency Department of Manisa Celal Bayar University Hafsa Sultan Hospital from February to May 2025. A cohort of 792 patients with traumatic lacerations was included. Five structured questions were presented to ChatGPT (GPT-4 model) to obtain recommendations on suture material, suture technique, suture count, wound classification, and the necessity of antibiotics. ChatGPT's recommendations were compared with the physicians' documented decisions. Concordance was measured using Cohen's kappa (κ), and subgroup analyses were conducted by wound size and location.
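As an illustration of the concordance measure named above, the following is a minimal sketch of Cohen's kappa for two raters, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement (the raw accuracy) and p_e is the agreement expected by chance from each rater's label frequencies. The function name `cohen_kappa` and the toy labels are hypothetical and are not taken from the study; the authors' actual statistical software is not stated in the abstract.

```python
from collections import Counter
import math


def cohen_kappa(rater_a, rater_b):
    """Return (observed_agreement, kappa) for two equal-length label lists."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)

    # Observed agreement p_o: fraction of cases where both raters chose the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement p_e: sum over labels of the product of the two marginal proportions.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        return p_o, math.nan
    return p_o, (p_o - p_e) / (1 - p_e)


# Hypothetical toy data: antibiotic indication for ten wounds (not study data).
physician = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "no", "yes"]
chatgpt = ["yes", "yes", "no", "no", "yes", "no", "no", "no", "no", "yes"]

accuracy, kappa = cohen_kappa(physician, chatgpt)
print(f"accuracy={accuracy:.2f}, kappa={kappa:.2f}")  # accuracy=0.90, kappa=0.80
```

In this toy case the raters agree on 9 of 10 wounds, but half of that agreement would be expected by chance, so κ is lower than the raw accuracy. This is why the two figures are usually reported side by side.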
The median age of the cohort was 28 years (IQR 16-40; range 2-80), and 79.9% of patients were male. The leading cause of injury was metal-related trauma, accounting for 47.7% of cases. Across all decision domains, agreement between ChatGPT and physicians was substantial to almost perfect (κ = 0.751-0.848; p < 0.001), with accuracy ranging from 82.6% to 91.7%. Concordance was highest for antibiotic indication and wound classification. Agreement was consistently high across all anatomic sites and was higher in wounds measuring 2.5 cm or more.
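Phrases such as "almost perfect agreement" match the descriptive benchmarks proposed by Landis and Koch (1977) for interpreting κ. The abstract does not state which scale the authors applied, so the sketch below assumes that one; the helper name `landis_koch_label` is hypothetical.

```python
def landis_koch_label(kappa):
    """Map a kappa value to its Landis and Koch (1977) descriptive band."""
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1.0")
    if kappa < 0.0:
        return "poor"
    # Upper bound of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


# The two ends of the kappa range reported in the abstract.
low_label = landis_koch_label(0.751)
high_label = landis_koch_label(0.848)
print(low_label, "/", high_label)  # substantial / almost perfect
```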
The results suggest that ChatGPT aligns well with physician decision-making in wound management planning, supporting its potential use as a clinical decision-support and educational tool in emergency medicine. When properly validated, AI systems may enhance consistency and training, provided that physician oversight is maintained.
PMID:
42386488
Bibliographic data and abstract were imported from PubMed on 02 Jul 2026.