The Impact of Large Language Models on Diagnostic Reasoning Among LLM-Trained Medical Doctors

Lahore University of Management Sciences

Status

Completed

Conditions

Diagnosis

Treatments

Other: ChatGPT-4o

Study type

Interventional

Funder types

Other

Identifiers

NCT06774612

IRB-0342

Details and patient eligibility

About

This study aims to evaluate whether large language model-trained medical doctors demonstrate enhanced diagnostic reasoning performance when utilizing ChatGPT-4o alongside conventional resources compared to using conventional resources alone.

Full description

Diagnostic errors are a major source of preventable patient harm. Recent advances in large language models (LLMs), particularly ChatGPT-4o, have shown promise in enhancing medical decision-making. However, little is known about their impact on the diagnostic reasoning of medical doctors (e.g., physicians and surgeons).

Diagnostic accuracy relies on complex clinical reasoning and careful evaluation of patient data. While AI assistance could potentially reduce errors and improve efficiency, ChatGPT-4o lacks medical validation and could introduce new risks by generating incorrect information (known as hallucinations). To mitigate these risks, doctors need adequate training in ChatGPT-4o's capabilities, limitations, and proper usage. Given these uncertainties and the importance of proper AI training, systematic evaluation is essential before clinical implementation.

This randomized study will assess whether ChatGPT-4o access improves LLM-trained medical doctors' diagnostic performance compared to conventional resources (e.g., textbooks, online medical databases) alone. All participating doctors will have completed at least a 10-hour training program covering ChatGPT-4o usage, prompt engineering techniques, and output evaluation strategies. Participants will provide differential diagnoses with supporting evidence and recommended next steps for clinical cases, with responses evaluated by blinded reviewers.

Enrollment

60 participants

Sex

All

Volunteers

Accepts Healthy Volunteers

Inclusion criteria

Full or provisional registration as a medical practitioner with the Pakistan Medical and Dental Council (PMDC).
Completion of the Bachelor of Medicine, Bachelor of Surgery (MBBS) examination. In the US and Canada, the equivalent degree is the Doctor of Medicine (MD).
Completion of a structured training program on the use of ChatGPT (or a comparable large language model) totaling at least 10 hours of instruction. The program must include hands-on practice in prompt engineering and content evaluation.

Exclusion criteria

Any other practitioner registered with the PMDC (full or provisional), e.g., professionals holding a Bachelor of Dental Surgery (BDS).

Trial design

Primary purpose

Diagnostic

Allocation

Randomized

Interventional model

Parallel Assignment

Masking

None (Open label)

60 participants in 2 groups

ChatGPT-4o

Active Comparator group

Description:

Group will be given access to ChatGPT-4o.

Treatment:

Other: ChatGPT-4o

Conventional resources

No Intervention group

Description:

Group will not be given access to ChatGPT-4o but will be encouraged to use any resources they wish other than large language models (e.g., PubMed, Google without AI Overviews).
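
The randomized, parallel-assignment design described above (60 participants in two groups) can be illustrated with a minimal sketch. The record does not state the randomization method, whether the groups are of equal size, or how participants are identified, so the 1:1 split, the fixed seed, the integer participant IDs, and the `allocate` helper below are all assumptions made for illustration only.

```python
import random

# Arm names as they appear in the record.
ARMS = ("ChatGPT-4o", "Conventional resources")


def allocate(participant_ids, arms=ARMS, seed=0):
    """Shuffle participants, then deal them round-robin into the arms.

    Assumption: simple 1:1 randomization. The record does not describe
    the actual procedure (e.g., block or stratified randomization).
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    allocation = {arm: [] for arm in arms}
    for index, participant in enumerate(shuffled):
        allocation[arms[index % len(arms)]].append(participant)
    return allocation


# Hypothetical participant IDs 1..60, matching the stated enrollment.
groups = allocate(range(1, 61))
for arm, members in groups.items():
    print(arm, len(members))
```

Dealing a shuffled list round-robin guarantees balanced group sizes, which simple coin-flip assignment would not. A real trial would typically generate the sequence in advance and conceal it from the people enrolling participants.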

Trial contacts and locations

Central trial contact

Ayesha Ali, PhD; Ihsan Ayyub Qazi, PhD

Data sourced from clinicaltrials.gov
