LLM Performance in Endodontic Diagnostics

Marmara University

Status

Completed

Conditions

Endodontic Diagnosis, Endodontic Diseases, Endodontic Treatment, Endodontic Decision-making

Treatments

Diagnostic Test: AI-Based Diagnostic Assessment

Study type

Observational

Funder types

Other

Identifiers

NCT07281066

2025-38

Details and patient eligibility

About

The goal of this prospective observational study is to evaluate the ability of three large language models (ChatGPT-4o, Gemini Advanced, and Claude 3.7) to support diagnosis and treatment decision-making in adult patients presenting with common endodontic conditions.

The main questions the study aims to answer are:

Can LLMs accurately determine the endodontic diagnosis when provided with structured clinical information and periapical radiographs?

Can LLMs propose appropriate treatment plans comparable to decisions made by endodontic specialists?

To answer these questions, researchers will compare the diagnostic and treatment accuracy of three AI models using a consensus diagnosis from endodontic specialists as the reference standard.

Participants will:

Receive routine endodontic examination and periapical radiographs as part of standard clinical care.

Have their anonymized clinical histories and radiographs entered into the three AI models.

Not interact directly with any AI system; all evaluations will be performed by the research team.

This study aims to understand how large language models perform under real-world clinical conditions and whether these systems may play a supportive role in endodontic diagnostics in the future.

Full description

This prospective observational study aims to evaluate the real-time diagnostic and treatment decision-making performance of three large language models (ChatGPT-4o, Gemini Advanced, and Claude 3.7) in an endodontic clinical setting. A total of 120 patients presenting to the endodontic clinic were examined, and detailed medical/dental histories, clinical findings, and periapical radiographs were collected. Each anonymized case was then presented to the three LLMs using a standardized prompt asking for the diagnosis and the appropriate treatment plan.

All models were used in their default multimodal configurations without enabling web-search functions, plug-ins, or external data retrieval. Each question was submitted only once, in its own isolated chat session, to prevent memory carry-over. Responses were saved verbatim and compared with the reference diagnoses and treatment plans established by a panel of endodontic specialists.
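The comparison against the reference standard can be illustrated with a minimal Python sketch. It assumes that a response counts as correct when it matches the panel's diagnosis exactly after case and whitespace normalization; the record does not state how agreement was actually scored, so this matching rule is an assumption. The function names (`normalize`, `accuracy_by_model`) and the example records are hypothetical placeholders, not study data.

```python
from collections import defaultdict


def normalize(label):
    """Lowercase a label and collapse runs of whitespace (assumed matching rule)."""
    return " ".join(label.lower().split())


def accuracy_by_model(responses, reference):
    """Return the fraction of cases each model answered in agreement with the reference.

    responses: iterable of (case_id, model_name, diagnosis) tuples.
    reference: dict mapping case_id to the specialist consensus diagnosis.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for case_id, model, diagnosis in responses:
        total[model] += 1
        if normalize(diagnosis) == normalize(reference[case_id]):
            correct[model] += 1
    return {model: correct[model] / total[model] for model in total}


# Illustrative placeholder records only; these are NOT results from the study.
example_reference = {
    "case-001": "Symptomatic irreversible pulpitis",
    "case-002": "Pulp necrosis",
}
example_responses = [
    ("case-001", "model-a", "symptomatic irreversible pulpitis"),
    ("case-002", "model-a", "Symptomatic apical periodontitis"),
    ("case-001", "model-b", "Symptomatic  irreversible pulpitis"),
    ("case-002", "model-b", "pulp necrosis"),
]

if __name__ == "__main__":
    for model, accuracy in accuracy_by_model(example_responses, example_reference).items():
        print(f"{model}: {accuracy:.2f}")
```

Exact matching is the simplest possible rule. Free-text model output would more likely need manual grading by specialists or a mapping onto a fixed list of diagnostic categories before a comparison like this is meaningful.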

This study was designed to mimic real-world clinical conditions as closely as possible, providing a realistic assessment of how these systems might perform when used by clinicians in everyday practice. Understanding their capabilities and limitations in authentic clinical scenarios is essential, as LLMs are expected to play a growing role in future dental care, particularly in decision support, triage, and patient education. By identifying where these models perform well and where they fall short, this research aims to inform safe and effective clinical integration as LLM technologies continue to advance.

Enrollment

120 patients

Sex

All

Ages

18 to 65 years old

Volunteers

No Healthy Volunteers

Inclusion criteria

Adult patients (≥18 years old) presenting to or referred to the Endodontic Clinic.

Patients with a clinically verified endodontic condition requiring diagnosis and treatment planning.

Patients who agreed to participate and provided informed consent.

Patients for whom a complete paper-based medical/dental history and periapical radiograph were obtained during the clinical visit.

Exclusion criteria

Patients who declined participation or did not provide informed consent.

Pediatric patients (<18 years old) referred to the Pediatric Dentistry Clinic.

Patients attending the clinic with non-endodontic complaints (e.g., post-extraction alveolitis, third-molar extraction problems).

Cases with incomplete clinical information or missing radiographs.

Patients unable to undergo standard endodontic examination procedures.

Trial design

120 participants in 1 patient group

Endodontic Patients Cohort

Description:

This cohort includes 120 consecutive patients presenting to the endodontic clinic with clinically verified endodontic conditions. Clinical history and periapical radiographs were collected, and diagnostic/treatment recommendations generated by AI models were compared with expert consensus.

Treatment:

Diagnostic Test: AI-Based Diagnostic Assessment

Data sourced from clinicaltrials.gov