Physician Reasoning on Diagnostic Cases With Large Language Models

Stanford University

Status

Completed

Conditions

Diagnosis

Treatments

Other: GPT-4

Study type

Interventional

Funder types

Other

Details and patient eligibility

About

This study will evaluate the effect of providing access to GPT-4, a large language model, compared with conventional diagnostic decision support resources, on performance in case-based diagnostic reasoning tasks.

Full description

Artificial intelligence (AI) technologies, specifically advanced large language models like OpenAI's ChatGPT, have the potential to improve medical decision-making. Although ChatGPT-4 was not developed specifically for medical applications, it has demonstrated promise in various healthcare contexts, including medical note-writing, addressing patient inquiries, and facilitating medical consultation. However, little is known about how ChatGPT augments the clinical reasoning abilities of clinicians.

Clinical reasoning is a complex process involving pattern recognition, knowledge application, and probabilistic reasoning. Integrating AI tools like ChatGPT-4 into physician workflows could help reduce clinician workload and decrease the likelihood of missed diagnoses. However, ChatGPT-4 was not developed for the purpose of clinical reasoning, nor has it been validated for this purpose. Further, it may produce misinformation, including convincing confabulations that could mislead clinicians. If clinicians misuse this tool, it may not improve diagnostic reasoning and could even cause harm. Therefore, it is important to study how clinicians use large language models to augment clinical reasoning before they are routinely incorporated into patient care.

In this study, we will randomize participants to answer diagnostic cases with or without access to ChatGPT-4. Participants will be asked to give three differential diagnoses for each case, with supporting and opposing findings for each diagnosis. Additionally, they will be asked to provide their top diagnosis along with next diagnostic steps. Answers will be graded by independent reviewers blinded to treatment assignment.

Enrollment

50 participants

Sex

All

Volunteers

Accepts Healthy Volunteers

Inclusion criteria

  • Licensed physicians who have completed at least post-graduate year 2 (PGY2) of medical training.
  • Training in internal medicine, family medicine, or emergency medicine.

Exclusion criteria

  • Not currently practicing clinically.

Trial design

Primary purpose

Diagnostic

Allocation

Randomized

Interventional model

Parallel Assignment

Masking

Single Blind

50 participants in 2 groups

GPT-4
Active Comparator group
Description:
Group will be given access to GPT-4.
Treatment:
Other: GPT-4
Usual resources
No Intervention group
Description:
Group will not be given access to GPT-4 but will be encouraged to use any resources they wish other than large language models (e.g., UpToDate, DynaMed, Google).

Trial contacts and locations

Central trial contact

Jonathan H Chen, MD, PhD; Robert J Gallo, MD

Data sourced from clinicaltrials.gov
