Scalable Clinical Oversight of Large Language Models Via Uncertainty Triangulation (SCOUT)

National Center for Cardiovascular Diseases

Status

Not yet enrolling

Conditions

Coronary Heart Disease (CHD)

Treatments

Diagnostic Test: Standard Manual Review Workflow

Diagnostic Test: SCOUT-Assisted Review Workflow

Study type

Interventional

Funder types

Other

Identifiers

NCT07414966

2025-2702-1

Details and patient eligibility

About

This prospective, multi-reader, randomized crossover trial evaluates SCOUT (Scalable Clinical Oversight via Uncertainty Triangulation), a model-agnostic meta-verification framework that selectively defers unreliable large language model (LLM) predictions to clinicians by triangulating three orthogonal uncertainty signals: model heterogeneity, stochastic inconsistency, and reasoning critique. The trial assesses whether SCOUT-assisted review can reduce physician review time compared with standard manual review of AI-generated diagnoses while maintaining non-inferior diagnostic accuracy in coronary heart disease (CHD) subtyping.

Full description

Background: Large language models are increasingly deployed in clinical workflows, yet requiring clinician review of every AI output negates the efficiency gains that motivate their adoption. SCOUT addresses this efficiency-safety paradox through algorithmic meta-verification.

The SCOUT framework triangulates three orthogonal external signals to determine case-level uncertainty:
(1) Model Heterogeneity: whether a structurally different auxiliary LLM agrees with the primary model.
(2) Stochastic Inconsistency: whether repeated sampling from the same model yields divergent outputs.
(3) Reasoning Critique: whether an external checker model identifies logical flaws in the chain-of-thought reasoning.
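
As a minimal sketch of how the three signals might be derived for a single case, assume each model returns one discrete subtype label and the checker returns a yes/no verdict. The names `CaseSignals` and `compute_signals`, the placeholder labels, and the strict-unanimity rule for repeated samples are illustrative assumptions; the registry entry does not describe the actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseSignals:
    """Three binary uncertainty flags for one case (hypothetical structure)."""
    model_heterogeneity: bool       # auxiliary LLM disagrees with the primary model
    stochastic_inconsistency: bool  # repeated samples from the same model diverge
    reasoning_critique: bool        # checker model reports a logical flaw


def compute_signals(primary_label, auxiliary_label, resampled_labels, checker_found_flaw):
    """Derive the three flags from raw model outputs.

    Assumption: any single resample that differs from the primary label
    counts as stochastic inconsistency. A real system might use a softer
    agreement threshold instead.
    """
    heterogeneity = primary_label != auxiliary_label
    inconsistency = any(label != primary_label for label in resampled_labels)
    return CaseSignals(heterogeneity, inconsistency, bool(checker_found_flaw))


# Placeholder labels; the actual CHD subtype vocabulary is not given in the source.
agreeing = compute_signals("subtype_A", "subtype_A", ["subtype_A", "subtype_A"], False)
divergent = compute_signals("subtype_A", "subtype_B", ["subtype_A", "subtype_B"], True)
```

Here `agreeing` raises no flags, while `divergent` raises all three.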

In this crossover trial, 7 clinicians of varying seniority (2 junior residents, 3 senior residents, 2 attending physicians) each review all 110 cases under both standard manual review and SCOUT-assisted review workflows. The study evaluates workflow efficiency (primary endpoint) and diagnostic accuracy (secondary endpoint).

Enrollment

7 estimated participants

Sex

All

Ages

18+ years old

Volunteers

No Healthy Volunteers

Inclusion criteria

Board-certified or in-training cardiologists at Fuwai Hospital
Spanning three experience strata: junior residents, senior residents, attending physicians

Exclusion criteria

Clinicians involved in the development or optimization of the SCOUT framework
Clinicians involved in the gold-standard adjudication process

Trial design

Primary purpose

Diagnostic

Allocation

Randomized

Interventional model

Crossover Assignment

Masking

None (Open label)

7 participants in 2 groups

Control (Standard Manual Review)

Active Comparator group

Description:

Physicians manually review all cases in the control set (n=54) with access to AI predictions and reasoning. No selective deferral is applied.

Treatment:

Diagnostic Test: Standard Manual Review Workflow

Experimental (SCOUT-Assisted Review)

Experimental group

Description:

Physicians process the intervention set (n=56) through the SCOUT framework. Low-uncertainty cases are auto-accepted; high-uncertainty cases undergo physician review with a full audit trail.
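
The routing step described above could be sketched as follows. The rule that any single raised signal defers the case, the function names, and the record fields are assumptions made for illustration; the entry does not state how the signals are combined or what the audit trail contains.

```python
def route_case(case_id, flags):
    """Route one case given a dict of signal name -> bool.

    Assumption: a case is high-uncertainty if at least one signal is raised.
    The returned record doubles as a simple audit-trail entry.
    """
    raised = sorted(name for name, is_raised in flags.items() if is_raised)
    decision = "physician_review" if raised else "auto_accept"
    return {"case_id": case_id, "decision": decision, "raised_signals": raised}


def deferral_rate(records):
    """Fraction of cases sent to physician review (0.0 for an empty list)."""
    if not records:
        return 0.0
    deferred = sum(1 for r in records if r["decision"] == "physician_review")
    return deferred / len(records)


# Hypothetical flags for three cases.
records = [
    route_case("case-001", {"model_heterogeneity": False,
                            "stochastic_inconsistency": False,
                            "reasoning_critique": False}),
    route_case("case-002", {"model_heterogeneity": True,
                            "stochastic_inconsistency": False,
                            "reasoning_critique": True}),
    route_case("case-003", {"model_heterogeneity": False,
                            "stochastic_inconsistency": True,
                            "reasoning_critique": False}),
]
```

Under this rule, the deferral rate gives the share of cases still reviewed by a physician, which relates directly to the review-time endpoint.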

Treatment:

Diagnostic Test: SCOUT-Assisted Review Workflow

Trial contacts and locations

Central trial contact

Xiaojin Gao, Dr.

Data sourced from clinicaltrials.gov
